Abstract

We propose an infinitesimal dispersion index for Markov counting processes. We show that, under standard moment 
existence conditions, a process is infinitesimally (over-) equi-dispersed if, and only if, it is simple (compound), i.e. 
it increases in jumps of one (or more) unit(s), even though infinitesimally equi-dispersed processes might be under-, 
equi- or over-dispersed using previously studied indices. Compound processes arise, for example, when introducing 
continuous-time white noise to the rates of simple processes resulting in Lévy-driven SDEs. We construct multivariate
infinitesimally over-dispersed compartment models and queuing networks, suitable for applications where moment 
constraints inherent to simple processes do not hold. 

Keywords: continuous time; counting Markov process; birth-death process; environmental stochasticity; 
infinitesimal over-dispersion; simultaneous events 



1. Introduction 

Continuous-time stochastic processes are widely used as a modeling tool for studying dynamical systems in differ- 
ent fields. Most continuous-time processes proposed in the literature belong to one of two large families: real-valued 
processes which can be written as solutions to stochastic differential equations [20, 27] and discrete-valued processes 
defined via counting processes [9, 30, 8] or Markov chains [4]. In this paper, we focus on the intersection between 
counting processes and Markov processes, namely Markov counting processes (MCPs from this point onward). MCPs 
are building blocks for models which are heavily used in biology (in the context of compartment models) and engi- 
neering (in the context of queues and queuing networks) as well as in many other fields. 

A counting process is a continuous-time, non-decreasing, non-negative, integer-valued stochastic process. The 
counting process is said to count events each of which has an associated event time. A counting process is simple 
if, with probability one, there is no time at which two or more events occur simultaneously. A process which is not 
simple is called compound. Simpleness is a convenient, and therefore widely adopted, property for both the theory 
and applications of counting processes [9]. The Markov property is also a convenient and widespread property of 
stochastic models. However, we will show that simple MCPs, combining these two attractive properties, have severe 
limitations in terms of the range of possible relationships between their infinitesimal mean and variance. Previous 
approaches to negotiate this difficulty have centered on sacrificing the Markov property rather than simpleness. How- 
ever, there are theoretical and practical attractions to the alternative strategy of maintaining the Markov property while 
allowing for simultaneous events. Investigating such models is the topic of this paper.

The ratio of the variance to the mean of a random variable is called its dispersion. Many well-known integer-valued
distributions have dispersion constraints. These constraints are often not reproduced in data from applications,
the data typically having additional variance and therefore being termed over-dispersed [26]. The same issues arise
in integer-valued stochastic processes [6] and, as a result, there is a considerable literature devoted to extending oth- 
erwise appealing models which are unable to reproduce observed variability. Typically, over-dispersion has been 
studied via defining stochastic processes in which some parameters are themselves modeled as stochastic in order to 
produce additional variability. This idea has been widely applied since the pioneering work of Greenwood and Yule 
[14], which derived the over-dispersed negative binomial distribution as a mixture of the Poisson distribution with 
a gamma-distributed parameter. Another early contribution is the Cox process [7], also known as the doubly-stochastic
Poisson process [8, 30, 9]. Some recent work has considered stochastic parameters for continuous-time Markov chains 
[10] and for non-Markovian processes [32]. Marion and Renshaw [24] and Varughese and Fatti [33] studied over- 
dispersion generated by standard birth-death processes with diffusion-driven rates, focusing on population dynamics 
applications. Both [24] and [33] proposed a mean-reverting Ornstein-Uhlenbeck process for the driving random
environment. Compound counting processes have been studied in the literature on batch processes [28], but we are not
aware of a previous investigation of infinitesimal dispersion in this context. To our knowledge, the first general class of
infinitesimally over-dispersed MCPs was proposed by Bretó et al. [5]. They achieved over-dispersion by introducing
white noise to rates of a multivariate process constructed via simple death processes, which was shown to result in
the possibility of simultaneous events. The main goal of this paper is to generalize the model of [5] by presenting a
systematic investigation of over-dispersed models via compound MCPs, in particular those defined by Lévy-driven
stochastic differential equations [2] resulting from introducing continuous-time white noise in the rate of simple MCPs
via Kolmogorov's differential equations. The applications of MCPs are too diverse to cover systematically here. One
concrete example, which has been a motivation for our work [5], is the study of infectious disease dynamics. Discrete- 
state Markov processes have proven useful models for studying many infectious disease transmission systems, and are 
central to current understanding of the spread of such diseases through populations [21]. However, standard disease 
models are constructed via simple MCPs and therefore struggle to match the statistical properties observed in data. 
Recent advances in statistical inference methodology [18, 1] have permitted fitting more general models, based on 
compound MCPs, to data [5, 16]. At least in this context, the substantial scientific consequences of adequately mod- 
eling over-dispersion in stochastic processes are consistent with the widely recognized importance of over-dispersion 
for drawing correct inferences from integer-valued regression models [26].

As concrete examples of models defined by Lévy-driven Kolmogorov's differential equations, we compute
infinitesimal moments and infinitesimal probabilities for various specific novel models. The availability of infinitesimal
probabilities makes possible exact simulation, and exact methods are particularly appropriate when dealing with small
counts, which arise naturally in some applications. In infectious disease applications, for example, small counts arise
at the start of an epidemic, which is a critical period for identifying and controlling the disease transmission. Exact
simulation of MCPs can be computationally demanding for processes with a very large number of events. In these
cases, it is standard to use approximations which are more affordable computationally but require some diagnostics
to investigate the validity of the approximation. To this end, both Euler-Maruyama time discretizations of MCPs and
diffusion approximations have been proposed in the literature [5, 16, 18, 22, 24, 33, 11]. Several algorithms have been
proposed in which two simulation methods are used, an exact one for small counts and a faster, approximate one for
larger counts [15]. In order to use combined algorithms of this type, it is necessary to choose a diffusion approximation,
given some MCP. Diffusions are defined in a straightforward way in terms of infinitesimal moments. Requiring
that the MCP and the proposed diffusion approximation have common infinitesimal moments gives a natural approach
for such an approximation, giving further motivation for the study of infinitesimal moments of MCPs.

A second goal of this paper is to propose the use of an infinitesimal dispersion index for counting processes in 
conjunction with standard indices. This provides a simple measure of dispersion, combining attractive theoretical 
properties with scientific interpretability, which is desirable when considering candidate processes for applications.
Markov processes specified as the solution to stochastic differential equations are naturally characterized by their 
infinitesimal mean and variance [20]. However, these infinitesimal moments have not been studied in the context 
of counting processes, perhaps because, as we will show, in the case of simple MCPs the infinitesimal variance is 
constrained to be equal to the infinitesimal mean. Instead, interest has focused on dispersion properties of increments 
of counting processes over fixed time windows, which we call integrated dispersion to distinguish it from infinitesimal 
dispersion. The study of integrally over-dispersed counting processes has a long history, going back at least to the 
start of the twentieth century [31] and continuing up to the present [e.g., 3]. Integrated dispersion undoubtedly has
an interest of its own, in particular if the integration window is chosen according to some specific criterion (possibly
motivated in applications by scientific evidence). Because of this window dependence, integrated dispersion may
give a distorted representation of a process, in the same way that discretizing a continuous-time process at different 
resolutions might give very different pictures. In particular, we show that all of integrated over-, equi- and under- 
dispersion may occur for infinitesimally equi-dispersed processes. By contrast, infinitesimal dispersion provides an 
intuitive and theoretically attractive measure which has already proven its worth in the study of real-valued Markov
processes. 

In Section 2 we investigate the infinitesimal moments of simple and compound MCPs, and compare them with 
previously studied measures of integrated dispersion. Then, in Section 3 we propose several novel over-dispersed 
compound MCPs. In Section 4, we define multivariate versions of the dispersion indices of Section 2 and find 
sufficient and necessary conditions for infinitesimal equi- and over-dispersion. Finally, in Section 5 we show how
the univariate MCPs of Section 3 may be used as building blocks for more complex processes, such as compartment 
models or queuing networks, which inherit the desired dispersion properties. We conclude with Section 6, where we 
discuss some conceptual and practical issues in modeling via compound MCPs. 

2. Dispersion of Markov counting processes 

One can study dispersion in the context of non-Markovian processes, but several considerations have led us to 
focus on the Markov case here. Firstly, there is less room for debate over the definition of appropriate measures of 
dispersion for Markov processes. Secondly, the extensively studied theory of Markov chains [4] allows us to avoid 
explicitly discussing measure-theoretic issues while being guaranteed that there are no difficulties concerning the 
existence and construction of the processes in question. Thirdly, our later goal of studying over-dispersed Markov 
counting processes clearly does not necessitate a complete investigation of non-Markovian possibilities. We comment
on some non-Markovian situations in Section 2.1. 

Let $\{N(t) : t \in \mathbb{R}^+\}$ be a time homogeneous Markov counting process, which we will refer to as $\{N(t)\}$. By
analogy with the terminology of infinitesimal and integrated moments of Section 1, we define infinitesimal increment
probabilities (or just infinitesimal probabilities) of an MCP to be

$$q(n, k) = \lim_{h \downarrow 0} \frac{P(\Delta N(t) = k \mid N(t) = n)}{h}. \tag{1}$$

These are also commonly referred to as the local characteristics of the transition semigroup, or the infinitesimal
generator of the corresponding MCP [4]. To clarify our notation, note that (1) are not actual probabilities but rather
the appropriate limit of the integrated probabilities $P(\Delta N(t) = k \mid N(t) = n)$. Here $t, h \in \mathbb{R}^+$ and $k, n \in \mathbb{N}$ with $k \ge 1$.
The operator $\Delta$ acting on a stochastic process is defined as $\Delta N(t) = N(t + h) - N(t)$ and the dependence of $\Delta N(t)$ on $h$
is suppressed. Following standard terminology for counting processes, we define the intensity or (infinitesimal) rate
function of such a process to be

$$\lambda(n) = \lim_{h \downarrow 0} \frac{1 - P(\Delta N(t) = 0 \mid N(t) = n)}{h}.$$

Note that in definition (1) we have allowed for simultaneous events, i.e. $\{N(t)\}$ need not be simple. Simple processes
may be fully specified via their rate function, which in this case is $\lambda(n) = q(n, 1)$, and this is also a measure of the
intensity at which events occur. A counting process jumps whenever there is an event, and we call the times at which
there is one or more events jump times. The jumps are of size one if the process is simple and might be of greater
size if the process is compound. We emphasize the difference between jump times and event times because these two
concepts overlap in the specific case of simple counting processes but are in general distinct. To specify a compound
process one needs to provide all the infinitesimal probabilities, since the rate $\lambda(n)$ corresponds only to the rate of
jumps and is uninformative about the distribution of jump sizes. For such processes, the infinitesimal mean may be a
superior measure of the intensity at which events occur.

We restrict ourselves to stable and conservative processes for which $\lambda(n) = \sum_{k \ge 1} q(n, k) < \infty$ for all $n$. Markov
processes satisfying these conditions form a very general class, and the MCP is then characterized by its infinitesimal
probabilities [4]. We also restrict ourselves to time homogeneous processes to add clarity to the concepts, results
and proofs. However, these can be readily generalized to the non-homogeneous case, for which the infinitesimal
probabilities also depend on time.
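
Because a stable and conservative MCP is characterized by its infinitesimal probabilities, a path can be simulated exactly from them. The following is a minimal sketch under stated assumptions, not a method taken from this paper: it uses the standard jump-chain construction (exponential holding time with rate $\lambda(n)$, then a jump of size $k$ chosen with probability $q(n, k)/\lambda(n)$), and the names `simulate_mcp` and `q_example` are hypothetical.

```python
import random

def simulate_mcp(q, n0, t_end, rng):
    """Simulate one path of a (possibly compound) MCP on [0, t_end].

    q is a user-supplied function (an assumption of this sketch) mapping a
    state n to a dict {jump size k: infinitesimal probability q(n, k)}.
    Returns a list of (jump time, count after the jump), starting at (0, n0).
    """
    t, n = 0.0, n0
    path = [(t, n)]
    while True:
        rates = q(n)
        lam = sum(rates.values())      # rate of jumps, lambda(n)
        if lam <= 0.0:                 # no further jumps are possible
            break
        t += rng.expovariate(lam)      # exponential holding time in state n
        if t > t_end:
            break
        sizes = list(rates)
        weights = [rates[k] for k in sizes]
        k = rng.choices(sizes, weights=weights)[0]
        n += k                         # k >= 2 means simultaneous events
        path.append((t, n))
    return path

def q_example(n):
    # Illustrative uniform MCP: single events at rate 1.0, pairs at rate 0.5.
    return {1: 1.0, 2: 0.5}

path = simulate_mcp(q_example, n0=0, t_end=10.0, rng=random.Random(1))
```

The rate $\lambda(n)$ alone fixes only the jump times; the jump sizes require the full set of $q(n, k)$, which is why the sketch takes the whole dictionary as input.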

Measures of dispersion which have previously been considered for counting processes include the variance to
mean ratio $V[N(t)]/E[N(t)]$ (for example in [13]) and the difference $V[N(t)] - E[N(t)]$ (in [6]). We will define the
integrated dispersion index of $\{N(t)\}$ as

$$D_N(n_0, t) = \frac{V[N(t) - N(0) \mid N(0) = n_0]}{E[N(t) - N(0) \mid N(0) = n_0]}. \tag{2}$$

Usually $n_0$ is assumed to be 0, in which case $D_N$ corresponds to the standard dispersion index defined as a ratio. Note
however that (1) defines $\{N(t)\}$ in infinitesimal terms. This suggests the infinitesimal dispersion index, which we define
as

$$D_{dN}(n) = \frac{\lim_{h \downarrow 0} h^{-1} V[N(t + h) - N(t) \mid N(t) = n]}{\lim_{h \downarrow 0} h^{-1} E[N(t + h) - N(t) \mid N(t) = n]} \tag{3}$$

as an alternative to $D_N$. The numerator and denominator of (3) are the standard definitions of infinitesimal variance
and infinitesimal mean respectively [20]. Note that these two moments are conditional and that dependence of the
infinitesimal moments on $n$ is suppressed. By the algebraic properties of limits, $D_{dN}(n_0) = \lim_{t \downarrow 0} D_N(n_0, t)$ as long as
the limit of the denominator of (3) exists. A process has traditionally been considered over-dispersed when
$V[N(t)] > E[N(t)]$. Analogously we define a process as infinitesimally (integrally) over-dispersed if $D_{dN} > 1$ ($D_N > 1$), for all $t$
and $n$, and define under- and equi-dispersion accordingly. For some processes, these conditions might not hold for all
$t$ or all $n$. In this case we specify the subsets for which they hold. Since we focus on infinitesimal properties, we will
drop the term infinitesimal in the rest of the paper in order to simplify notation. If we do not specify whether we refer
to infinitesimal or integrated moments or dispersion, it should be understood that we mean the former.

In light of definition (3), it is interesting to find necessary and sufficient conditions characterizing dispersion of
processes before considering construction of over-dispersed ones, which we proceed to do in Sections 3 and 5. In 
this section, we establish for univariate MCPs sufficient conditions for equi-dispersion in Theorem 1 and necessary 
conditions for over-dispersion in Corollary 2. This provides a starting point for our investigation. For example, 
it is immediate that compound Poisson processes (i.e., the class of compound MCPs with stationary independent 
increments) will be over-dispersed unless the event size distribution is degenerate at 1, in which case one is back to
the simple Poisson process. We defer the more complete result of necessary and sufficient conditions for both equi-
and over-dispersion to Section 4 where we consider multivariate processes. 

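To understand what lies at the heart of equi-dispersion and of Theorem 1, consider the following expressions for
the moments of the increments of a process $\{N(t)\}$:

$$E[\Delta N^r(t) \mid N(t)] = 0^r P(\Delta N(t) = 0 \mid N(t)) + 1^r P(\Delta N(t) = 1 \mid N(t)) + \sum_{k=2}^{\infty} k^r P(\Delta N(t) = k \mid N(t)),$$

$$\lim_{h \downarrow 0} \frac{E[\Delta N^r(t) \mid N(t)]}{h} = \lim_{h \downarrow 0} \frac{P(\Delta N(t) = 1 \mid N(t))}{h} + \lim_{h \downarrow 0} \frac{\sum_{k=2}^{\infty} k^r P(\Delta N(t) = k \mid N(t))}{h}. \tag{4}$$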

It is straightforward that the difference between any two infinitesimal or integrated moments comes from terms in the 
sum corresponding to increments of size larger than one, i.e. to simultaneous events. 
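
As a rough illustration of (4), the sketch below computes the rate, the infinitesimal mean and variance, and the resulting dispersion index directly from the infinitesimal probabilities of a single state. It assumes that the limit and the sum in (4) can be exchanged and that only finitely many $q(n, k)$ are non-zero, so that the $r$-th infinitesimal moment is $\sum_k k^r q(n, k)$; the function names and the example rates are hypothetical.

```python
def infinitesimal_moments(q_n):
    """Return (rate, mean, variance) for one state n.

    q_n is a dict {jump size k: q(n, k)}. Assumes the limit in h and the sum
    over k can be exchanged, so the r-th infinitesimal moment is
    sum_k k**r * q(n, k); the squared first moment is of order h**2 and does
    not contribute to the infinitesimal variance.
    """
    rate = sum(q_n.values())
    mean = sum(k * v for k, v in q_n.items())
    var = sum(k * k * v for k, v in q_n.items())
    return rate, mean, var

def dispersion_index(q_n):
    """Ratio of infinitesimal variance to infinitesimal mean."""
    _, mean, var = infinitesimal_moments(q_n)
    return var / mean

simple = {1: 2.0}              # jumps of size one only
compound = {1: 2.0, 3: 0.5}    # occasional triple events

print(dispersion_index(simple), dispersion_index(compound))
```

With jumps of size one only, mean and variance coincide with the rate; any mass on $k \ge 2$ adds more to the variance than to the mean, which is the mechanism behind over-dispersion of compound processes.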

An immediate way to proceed to obtain sufficient conditions for equi-dispersion would be to require orderliness
in the sense of Daley and Vere-Jones [9, page 47], i.e. $P(\Delta N(t) \ge 2) = o(h)$, and to investigate under which conditions
the $h$ limit can be exchanged with the limit of the infinite sum in (4). In Theorem 1 we present such a result. We use
the dominated convergence theorem to show that the limits commute under standard moment existence assumptions
for a univariate simple MCP, which proves that it is equi-dispersed.

An implication of Theorem 1 is that the Poisson process is not the only equi-dispersed counting process. In
particular, Corollaries 3 and 4 point out that the linear death process (or rather, the counting process associated with it)
and the linear birth process, both extensively studied and used in applications, are also seen to be infinitesimally
equi-dispersed. Nonetheless, these two processes are integrally under- and over-dispersed respectively. These integrated
dispersion constraints are summarized in Table 1 and are a direct result of the well-known property that their increments
follow binomial and negative binomial distributions respectively. Another implication, pointed out in Theorem 5 and
Corollary 6, is that mixing equi-dispersed MCPs with random variables does not alter dispersion. The fact that both the mixed
Poisson process and the birth process (with negative binomial increments) turn out to be integrally over-dispersed but
infinitesimally equi-dispersed might be unexpected.
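
The contrast between integrated and infinitesimal dispersion can be checked numerically. The sketch below is illustrative only: it assumes the standard increment distributions (Poisson, negative binomial for the linear birth process, binomial for the counting process of the linear death process), and the function names and process labels are hypothetical.

```python
import math

def increment_moments(process, rate, h, size=1):
    """Mean and variance of the increment over a window of length h.

    process: "poisson", "birth" or "death" (labels chosen for this sketch).
    size: current population n (birth) or remaining individuals d0 - n
          (death); ignored for the Poisson process.
    """
    if process == "poisson":            # Poisson(rate * h) increment
        return rate * h, rate * h
    if process == "birth":              # negative binomial increment
        growth = math.expm1(rate * h)   # e^{rate * h} - 1
        return size * growth, size * (growth + 1.0) * growth
    if process == "death":              # binomial increment
        p = -math.expm1(-rate * h)      # 1 - e^{-rate * h}
        return size * p, size * p * (1.0 - p)
    raise ValueError(f"unknown process: {process}")

def integrated_dispersion(process, rate, h, size=1):
    """Variance-to-mean ratio of the increment over a window of length h."""
    mean, var = increment_moments(process, rate, h, size)
    return var / mean

for h in (1.0, 0.1, 0.001):
    print(h, [round(integrated_dispersion(p, 0.5, h, size=10), 4)
              for p in ("poisson", "birth", "death")])
```

For any fixed window the birth ratio exceeds one and the death ratio falls below one, while all three ratios approach one as the window shrinks, in line with infinitesimal equi-dispersion.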

The moment existence conditions we use in our results concern the total number of events that an MCP $\{N(t)\}$ makes
in an interval $[t, t + h]$. Specifically, define a stochastic bound of the infinitesimal rate function $\lambda(N(s))$ conditional on
$N(t) = n$ by

$$\bar{\Lambda}(t) = \sup_{t \le s \le t + h} \lambda(N(s)). \tag{5}$$

Here, we suppress the dependence of $\bar{\Lambda}(t)$ on $n$ and $h$. Now consider the following two properties:

P1. For each $t$ and $n$ there is some $h > 0$ such that $E[\bar{\Lambda}(t)] < \infty$.

P2. For each $t$ and $n$ there is some $h > 0$ such that $V[\bar{\Lambda}(t)] < \infty$.

Properties P1 and P2 require that the MCP does not have explosive behavior, and in particular they hold for any
uniform MCP (i.e., an MCP for which $q(n, k) = q(k)$) in which the jumps are bounded by some $k_0$ (i.e., when for all
$k > k_0$, $q(n, k) = 0$). P1 and P2 also hold for the simple, linear birth process and for the counting process associated with
the simple, linear death process of Corollaries 3 and 4.

Theorem 1 (sufficient condition for Markov infinitesimal equi-dispersion). Let $\{N(t)\}$ be a simple, time homogeneous,
stable and conservative Markov counting process. Supposing (P1), the infinitesimal mean is the same as the infinitesimal
rate. Supposing (P2), the infinitesimal variance is also the same as the infinitesimal rate, and therefore $\{N(t)\}$ is
infinitesimally equi-dispersed.

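Proof. Let $\{P(t)\}$ be a conditional Poisson process with event rate $\bar{\Lambda}(t)$ and work conditionally on $N(t) = n$ whenever the
(potentially already conditional) expectation is taken over $\Delta N(t)$ or functions of it. Then, since $\Delta N(t)$ is non-negative,

$$E[\Delta N(t)] = E[\Delta N(t)\, \mathbf{1}\{\Delta N(t) > 0\}] = E[\mathbf{1}\{\Delta N(t) = 1\} + \Delta N(t)\, \mathbf{1}\{\Delta N(t) > 1\}]. \tag{6}$$

Now, it is immediate that $E[\mathbf{1}\{\Delta N(t) = 1\}] = \lambda(n) h + o(h)$. Also, since $\{N(t)\}$ is simple, $\{\Delta N(t)\}$ is stochastically
smaller than $\{\Delta P(t)\}$ and

$$E[\Delta N(t)\, \mathbf{1}\{\Delta N(t) > 1\}] \le E[\Delta P(t)\, \mathbf{1}\{\Delta P(t) > 1\}] = E\big[E[\Delta P(t)\, \mathbf{1}\{\Delta P(t) > 1\} \mid \bar{\Lambda}(t)]\big].$$

Using (6) with $N(t)$ replaced by $P(t)$, noting also that $E[\Delta P(t) \mid \bar{\Lambda}(t)] = h \bar{\Lambda}(t)$ and
$E[\mathbf{1}\{\Delta P(t) = 1\} \mid \bar{\Lambda}(t)] = h \bar{\Lambda}(t) \exp\{-h \bar{\Lambda}(t)\}$, it follows that

$$E[\Delta N(t)\, \mathbf{1}\{\Delta N(t) > 1\}] \le E[h \bar{\Lambda}(t) - h \bar{\Lambda}(t) \exp\{-h \bar{\Lambda}(t)\}] = E[h \bar{\Lambda}(t)(1 - \exp\{-h \bar{\Lambda}(t)\})].$$

It follows by dominated convergence, since $\bar{\Lambda}(t)(1 - \exp\{-h \bar{\Lambda}(t)\}) \le \bar{\Lambda}(t)$ and by the assumption that $E[\bar{\Lambda}(t)]$ is finite (note
that the distribution of $\bar{\Lambda}(t)$ depends on the window length fixed in P1 and not on the vanishing increment $h$), that

$$\lim_{h \downarrow 0} \frac{E[h \bar{\Lambda}(t)(1 - \exp\{-h \bar{\Lambda}(t)\})]}{h} = E\Big[\lim_{h \downarrow 0} \bar{\Lambda}(t)(1 - \exp\{-h \bar{\Lambda}(t)\})\Big] = 0.$$

Therefore, $E[\Delta N(t)\, \mathbf{1}\{\Delta N(t) > 1\}] = o(h)$ and $E[\Delta N(t)] = \lambda(n) h + o(h)$.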

Similarly, replacing first by second moments, $E[(\Delta N(t))^2\, \mathbf{1}\{\Delta N(t) > 1\}] = o(h)$ and $E[(\Delta N(t))^2] = \lambda(n) h + o(h)$,
since

$$E[(\Delta N(t))^2\, \mathbf{1}\{\Delta N(t) > 1\}] \le E[(\Delta P(t))^2\, \mathbf{1}\{\Delta P(t) > 1\}]$$
$$= E\big[E[(\Delta P(t))^2\, \mathbf{1}\{\Delta P(t) > 1\} \mid \bar{\Lambda}(t)]\big]$$
$$= E[h \bar{\Lambda}(t) + h^2 \bar{\Lambda}^2(t) - h \bar{\Lambda}(t) \exp\{-h \bar{\Lambda}(t)\}]$$
$$\le E[2 h^2 \bar{\Lambda}^2(t)] = o(h),$$

where the last line follows by $1 - \exp\{-x\} \le x$ and $E[\bar{\Lambda}^2(t)]$ being finite.

Equi-dispersion follows from $V[\Delta N(t)] = E[(\Delta N(t))^2] - E[\Delta N(t)]^2 = \lambda(n) h + o(h)$, where $E[\Delta N(t)]^2$ is $o(h)$ by
stability of $\{N(t)\}$ which implies $\lambda(n) < \infty$ for all $n$. □

                                 (a) Poisson    (b) Linear birth                        (c) Linear death
$E[\Delta N(t) \mid N(t) = n]$   $\alpha h$     $n(e^{\beta h} - 1)$                    $(d_0 - n)(1 - e^{-\delta h})$
$V[\Delta N(t) \mid N(t) = n]$   $\alpha h$     $n e^{\beta h}(e^{\beta h} - 1)$        $(d_0 - n)(1 - e^{-\delta h}) e^{-\delta h}$
$D_N(n_0, t)$                    $1$            $e^{\beta t}$                           $e^{-\delta t}$
$D_{dN}(n)$                      $1$            $1$                                     $1$

Table 1: Increment mean, increment variance and dispersion indices of the time homogeneous (a) Poisson process (with individual infinitesimal
rate $\alpha$), (b) linear birth process (with individual infinitesimal rate $\beta$ and initial population $n$) and (c) counting process associated with $\{N(t)\}$, a linear
death process (with individual infinitesimal rate $\delta$ and initial population $d_0$).
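
The key step in the proof of Theorem 1 is the Poisson identity $E[\Delta P\, \mathbf{1}\{\Delta P > 1\}] = m(1 - e^{-m})$ for a Poisson variable with mean $m = h\bar{\Lambda}$, which is of order $h^2$. The following numerical check is a sketch for a fixed, non-random rate only (the proof handles a random bound via dominated convergence); the function names and the truncation at 200 terms are assumptions of this illustration.

```python
import math

def poisson_tail_mean(m, terms=200):
    """E[P * 1{P > 1}] for P ~ Poisson(m), by direct summation over k >= 2."""
    total = 0.0
    pk = math.exp(-m)          # pk starts as P(P = 0)
    for k in range(1, terms):
        pk *= m / k            # now pk = P(P = k)
        if k >= 2:
            total += k * pk
    return total

def closed_form(m):
    """m - m * exp(-m): the mean minus the contribution of single events."""
    return m * (1.0 - math.exp(-m))

rate = 3.0                     # fixed rate used only for this illustration
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, closed_form(rate * h) / h)
```

The printed ratios shrink roughly in proportion to $h$, which is what makes the contribution of multiple events vanish in the infinitesimal mean.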

Corollary 2 (necessary condition for Markov infinitesimal over-dispersion). Let $\{N(t)\}$ be a time homogeneous,
stable and conservative Markov counting process. If, supposing (P1), the infinitesimal mean is not the same
as the infinitesimal rate or if, supposing (P2), the infinitesimal variance is not the same as the infinitesimal rate or the
process is not infinitesimally equi-dispersed, then $q(n, j) \neq 0$ for some $j \ge 2$, i.e., $\{N(t)\}$ must be a compound process.

Proof. Otherwise, by Theorem 1, the infinitesimal mean of $\{N(t)\}$ must coincide with the infinitesimal rate under (P1)
and so must the infinitesimal variance under (P2), i.e. $\{N(t)\}$ must be infinitesimally equi-dispersed. □

Corollary 3 (infinitesimal equi-dispersion of birth process). An MCP with $q(n, 1) = \beta n\, \mathbf{1}\{n > 0\}$ and $q(n, k) = 0$ for
$k > 1$ is a simple linear birth process for $\beta \in \mathbb{R}^+$ and is infinitesimally equi-dispersed.

Proof. This is a special case of the multivariate Corollary 14 which is proved in Section 4. □

Corollary 4 (infinitesimal equi-dispersion of death process). An MCP with $q(n, 1) = \delta(d_0 - n)\, \mathbf{1}\{n < d_0\}$ and $q(n, k) = 0$
for $k > 1$ is the counting process associated with $\{N(t)\}$, a simple linear death process with initial population $d_0 \in \mathbb{N}$,
for $\delta \in \mathbb{R}^+$ and is infinitesimally equi-dispersed.

Proof. This is a special case of the multivariate Corollary 14 which is proved in Section 4. □

2.1. Dispersion of mixed Markov counting processes 

The mixed Poisson process in Daley and Vere-Jones [9], which Snyder and Miller [30] call the Pólya process, is a
natural extension of the Poisson process where a mixing random variable $M$ is used as the rate to define a Poisson
process conditional on $M$. An immediate result of this mixing is that the resulting process is integrally over-dispersed.
It is straightforward to generalize this notion to simple mixed MCPs, where $\{N(t)\}$ is specified as an MCP conditional
on $M$ with rate function $\lambda(n) = \lambda(n, M)$. Mixed MCPs are non-Markovian but the measures of dispersion defined
in (3) and (2) can still be computed and discussed. For non-Markovian processes, conditioning on the entire past
history in (3) could also be considered.

Theorem 5 extends Theorem 1 by showing that conditions P1 and P2 ensure the equi-dispersion of simple mixed
MCPs. Here, the analogous definition to (5) is $\bar{\Lambda}(t) = \sup_{t \le s \le t + h} \lambda(N(s))$. A proof of Theorem 5 is given in Appendix
A. In the context of mixed MCPs, P1 or P2 imply that $E[\lambda(n)] < \infty$, so the tails of the additional randomness resulting
from $M$ are required to be not too heavy.

Theorem 5 (sufficient condition for mixed Markov infinitesimal equi-dispersion). Let $\{N(t)\}$ be a simple, time
homogeneous, stable and conservative Markov counting process conditionally on a mixing random variable $M$. Supposing
(P1), the infinitesimal mean is the same as the average infinitesimal rate. Supposing (P2), the infinitesimal variance
is also the same as the average infinitesimal rate, and therefore $\{N(t)\}$ is infinitesimally equi-dispersed.

Corollary 6 (infinitesimal equi-dispersion of mixed Poisson process). A conditional MCP with $q(n, 1, M) = M$ and
$q(n, k, M) = 0$ for $k > 1$ is a mixed Poisson process and is infinitesimally equi-dispersed if $E[M] < \infty$.

Proof. This follows directly from Theorem 5. □ 
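
For the mixed Poisson process started at $N(0) = 0$, the integrated index (2) has a simple closed form by the laws of total expectation and total variance, which makes the vanishing of the extra variability easy to see. The sketch below is an illustration under the assumption that $M$ has finite mean and variance; the function name and the numerical values are hypothetical.

```python
def mixed_poisson_dispersion(mean_m, var_m, t):
    """Integrated dispersion D_N(0, t) of a mixed Poisson process.

    Given M, N(t) is Poisson(M * t), so
      E[N(t)] = t * E[M],
      V[N(t)] = t * E[M] + t**2 * V[M]   (law of total variance).
    """
    mean = t * mean_m
    var = t * mean_m + t * t * var_m
    return var / mean                    # equals 1 + t * V[M] / E[M]

for t in (1.0, 0.1, 0.001):
    print(t, mixed_poisson_dispersion(mean_m=2.0, var_m=4.0, t=t))
```

The excess over one is proportional to the window length, so the process is integrally over-dispersed for every $t > 0$ but the index tends to one as $t \downarrow 0$.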

3. Over-dispersed Univariate Markov Counting Processes 

From Section 2, we know that simple MCPs are equi-dispersed under standard moment conditions. We therefore 
seek to generalize standard simple MCP models, to relax this dispersion constraint. Our first approach is to investigate 
random time change, or subordination, which we show is equivalent to the inclusion of continuous-time noise in the 
rate function. Then, in Section 3.1.4, we are led to consider a subtly different approach of defining an over-dispersed 
MCP via the limit of a sequence of processes in which discrete-time noise is used to modify the rate. 

We know from Section 2 that introducing noise via a mixing random variable in the rate function does not alter the 
equi-dispersion of simple MCPs. In other words, this additional variability disappears infinitesimally. This suggests 
considering more complex, alternative noise processes. One possibility is to introduce some continuous-time process, 
say $\{\eta(t)\}$, in the rate function of the MCP. Such constructions may be expected to give processes which are Markov
conditional on $\{\eta(t)\}$ but not unconditionally. Our approach is similar to that of [24] and [33]; we propose defining a
process by replacing $\lambda(n)$, the deterministic rate function of the original MCP, in Kolmogorov's backward differential
system by the stochastic process $\{\lambda(n, \eta(t))\}$ (see Appendix C for a formal definition). However, by taking $\{\eta(t)\}$
to be a suitable white noise process, we differ from [24] and [33] by constructing processes which will be shown
to be unconditionally Markov. The consideration of non-white noise is no doubt appropriate in some applications,
but white noise provides a relatively simple extension to equi-dispersed processes controlled by a single intensity
parameter. Staying within the class of Markov processes also facilitates both theoretical and numerical analysis of the
resulting models.

The noise process $\{\eta(t)\}$ could enter $\lambda(n)$ additively or multiplicatively. Given the non-negativity constraint on the
infinitesimal rate functions, multiplicative non-negative noise is a simple and convenient choice. We refer to white
noise, $\{\xi(t)\} = \{dL(t)/dt\}$, as the derivative of an integrated noise process $\{L(t)\}$ which has stationary independent
increments. Note that we do not necessarily require that the mean of $L(t)$ is zero. Although $\{\xi(t)\}$ may not exist,
in the sense that $\{L(t)\}$ may not have differentiable sample paths, $\{\xi(t)\}$ can nevertheless be given formal meaning
[20, 5]. Restricting $\{\xi(t)\}$ to non-negative white noise, the family of increasing Lévy processes provides a rich class
from which to choose the integrated noise $\{L(t)\}$. Multiplicative unbiased noise is achieved by requiring $E[L(t)] = t$,
in which case $\lim_{h \downarrow 0} E[\Delta L(t)]/h = 1$.

From an alternative perspective, in the context of the general theory of Markov processes, random time change or 
subordination of an initial process is a well established tool to obtain new processes. Following Sato [29], let $\{M(t)\}$
(the directing process) be a temporally homogeneous Markov process and $\{L(t)\}$ (the subordinator) be an increasing
Lévy process. Any temporally homogeneous Markov process $\{N(t)\}$ identical in law to $\{M \circ L(t)\} = \{M(L(t))\}$ is said
to be subordinate to $\{M(t)\}$ by the subordinator $\{L(t)\}$.

Theorem 7 below (proved in Appendix C) formally states that subordinate processes to simple (and hence
equi-dispersed) MCPs are equivalent to solutions of Lévy-driven stochastic differential equations resulting from
introducing unbiased multiplicative Lévy white-noise in the deterministic Kolmogorov backward differential system of the
directing process. This gives us a licence to interpret noise on the rate of an MCP as subordination of the MCP to a
Lévy process. In Subsection 3.1, we obtain exact results when investigating concrete examples of over-dispersion by
exploiting this connection between gamma white noise in the rates and gamma subordinators. The general arguments 
of Appendix C and the particular processes of Subsection 3.1 may both be of interest to the reader, but to preserve 
the flow of the main themes of this paper we have chosen to defer the technical details involved in the link between 
subordination and stochastic rates to an appendix. 

Theorem 7 (Lévy white noise and subordination). Consider the simple, time homogeneous, stable and conservative
Markov counting process $\{M_\lambda(t)\}$ defined by the rate function $\lambda(m)$. Let $\{L(t)\}$ be a non-decreasing Lévy process
with $L(0) = 0$ and $E[L(t)] = t$. Let $\{M_{\lambda\xi}(t)\}$ be the process resulting from introducing unbiased, non-negative,
multiplicative, Lévy white-noise $\{\xi(t)\} = \{dL(t)/dt\}$ in the rate of $\{M_\lambda(t)\}$, defined as the solution to the Lévy-driven
Kolmogorov backward differential system in (C.3). Then, if this solution exists and is unique,

$$M_{\lambda\xi}(t) \sim M_\lambda \circ L(t) \sim M_\lambda\Big(\int_0^t \xi(u)\, du\Big).$$



3.1. Subordinate processes to simple MCPs by gamma subordinators 

A convenient candidate for non-negative continuous-time noise is gamma noise. In this case, the integrated noise
process $L(t) = \Gamma(t)$ is a gamma process defined to have independent, stationary increments with
$\Gamma(t) - \Gamma(s) \sim \mathrm{Gamma}([t - s]/\tau, \tau)$. Here, $\mathrm{Gamma}(\alpha, \beta)$ is the gamma distribution with mean $\alpha\beta$ and variance $\alpha\beta^2$. In
Subsections 3.1.1-3.1.3, we study the inclusion of gamma noise in the rates of the Poisson process, linear birth process and
linear death process, each of which has been shown to be equi-dispersed in Section 2.
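
To see why a gamma subordinator produces dispersion that survives in the limit, consider the simplest case of a Poisson process with rate $\alpha$ run on the gamma clock. The sketch below is an illustration rather than a result quoted from this paper: the moments are derived here from the law of total variance, using that the clock increment over a window $h$ is $\mathrm{Gamma}(h/\tau, \tau)$ with mean $h$ and variance $h\tau$, and the function names are hypothetical.

```python
import random

def poisson_gamma_moments(alpha, tau, h):
    """Mean and variance of an increment of a Poisson process of rate alpha
    subordinated to a gamma process with Gamma(h / tau, tau) increments."""
    clock_mean, clock_var = h, h * tau           # moments of the clock increment
    mean = alpha * clock_mean
    var = alpha * clock_mean + alpha ** 2 * clock_var   # law of total variance
    return mean, var

def sample_increment(alpha, tau, h, rng):
    """Draw one increment: gamma clock increment, then Poisson events on it."""
    clock = rng.gammavariate(h / tau, tau)       # shape h / tau, scale tau
    count = 0
    arrival = rng.expovariate(alpha)             # first event on the clock scale
    while arrival <= clock:
        count += 1
        arrival += rng.expovariate(alpha)
    return count

mean, var = poisson_gamma_moments(alpha=2.0, tau=0.5, h=1.0)
print(var / mean)   # variance-to-mean ratio, equal to 1 + alpha * tau
```

Under these assumptions the ratio $1 + \alpha\tau$ does not depend on $h$, so the excess variance does not disappear as the window shrinks, in contrast with mixing by a single random variable.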

Following the convention for naming of subordinate processes [29], we will place the name of the original process 
first, followed by the name of the driving subordinating noise. We have chosen to study in detail the Poisson, linear 
birth and linear death processes because they are basic blocks widely used to build more complex, multi-process 
models, such as compartmental models used in population dynamics and queuing networks in engineering. What 
makes these three processes fundamental is that they capture in the simplest way, i.e. linearly, the most common 
possibilities in real applications. Namely, events that by occurring "kill" the potential for future events (death process, 
or negative feedback); events that "reproduce" meaning that their occurrence fuels that of future events (birth process, 
or positive feedback); and events which occur independently of the events which have already happened (Poisson, or 
immigration process, or no feedback). However, our approach could be extended to other processes that might be of 
interest. 

For these processes we provide three results: their first two moments about the mean, which show they are indeed 
over-dispersed; the distribution of the counting process, which allows for exact, direct simulation of the counting 
process; and a closed form for the infinitesimal probabilities, which fully characterize the processes and may be used 
for exact simulation of the event times of the point process and for indirect, exact simulation of the counting process 
by aggregation. 

Since, as shown in Section 4, multivariate processes built upon univariate processes retain the dispersion con- 
straints of the latter, constructing over-dispersed multivariate processes, a conceptually more complex task, can be 
achieved using the provided infinitesimal probabilities of over-dispersed univariate processes as building blocks, the 
same way it is routinely done with equi-dispersed processes. This highlights the relevance of the univariate results in 
this section. 

3.1.1. The Poisson gamma process 

We construct an over-dispersed Poisson process. This is a special case of the general compound Poisson process 
[9], which can be constructed as independent jumps from an arbitrary distribution occurring at the times of a Poisson 
process. Our alternative construction, derived through introducing white noise on the rate, has an advantage that it 
can be applied (as we show) not just to Poisson processes but to more general univariate and multivariate processes. 

Proposition 8 (Poisson gamma process). Let $\{M(t)\}$ be a MCP with $q(m,1) = \alpha$ and $q(m,k) = 0$ for $k > 1$, i.e. a time homogeneous Poisson process with rate $\alpha$. Introducing continuous-time gamma noise $\{\xi(t)\} = \{d\Gamma(t)/dt\}$ where $\Gamma(t) \sim \mathrm{Gamma}(t/\tau, \tau)$ defines $\{N(t)\}$, a compound infinitesimally over-dispersed MCP with increment probabilities for $k \in \mathbb{N}$



where $\omega = \tau\alpha$, $p = (1+\omega)^{-1}$ and we use $G$ for the gamma function. The infinitesimal probabilities are



for $k \geq 1$ and zero otherwise. The infinitesimal rate is

$$\lambda(n) = \tau^{-1}\log(p^{-1}).$$

The infinitesimal moments are $\mu_{dN} = \alpha$ and $\sigma^2_{dN} = (1+\tau\alpha)\,\mu_{dN}$ with dispersion $D_{dN}(n) = 1 + \tau\alpha$.
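
The rate and moments stated in Proposition 8 can be checked numerically. The sketch below assumes that the infinitesimal probabilities take the logarithmic-series form $q(n,k) = (\tau k)^{-1}(1-p)^k$ with $p = (1+\tau\alpha)^{-1}$, which is the small-interval limit of a negative binomial increment; this form is an assumption made here for illustration and is not quoted from the proposition. The function names and the truncation point `k_max` are also hypothetical.

```python
import math

def poisson_gamma_q(k, alpha, tau):
    """Assumed infinitesimal probability of a jump of size k >= 1."""
    p = 1.0 / (1.0 + tau * alpha)
    return (1.0 - p) ** k / (tau * k)

def poisson_gamma_summary(alpha, tau, k_max=500):
    """Truncated sums for the rate, infinitesimal mean and dispersion."""
    ks = range(1, k_max + 1)
    q = [poisson_gamma_q(k, alpha, tau) for k in ks]
    rate = sum(q)                                        # sum_k q(n, k)
    mean = sum(k * qk for k, qk in zip(ks, q))           # sum_k k q(n, k)
    second = sum(k * k * qk for k, qk in zip(ks, q))     # sum_k k^2 q(n, k)
    return rate, mean, second / mean

rate, mean, dispersion = poisson_gamma_summary(alpha=2.0, tau=0.5)
```

Under these assumptions the three numbers agree with $\tau^{-1}\log(1+\tau\alpha)$, $\alpha$ and $1+\tau\alpha$ up to truncation error, which is negligible here because $(1-p)^k$ decays geometrically.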

3.1.2. The binomial gamma process

Here, we consider multiplicative gamma noise on the rate of a linear death process. This process has been proposed as a model for biological populations [5], although it was defined as the limit of discrete-time stochastic processes rather than as the solution to the Lévy-driven Kolmogorov differential system of (C.3). It is standard to define death processes as decreasing processes; however, our general framework has been for counting processes, which are necessarily increasing. To resolve this minor point, we will use the following notation. For some positive integer $d_0$, define $\{\tilde{A}(t)\} = \{\max\{d_0 - A(t), 0\}\}$ where $\{A(t)\}$ is a MCP. The tilde then represents a transformation which, when applied to a MCP, defines a non-increasing Markov process which may be thought of as the number of individuals still alive by time $t$ out of the initial $d_0$ when the MCP $\{A(t)\}$ counts the number of deaths.

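Proposition 9 (binomial gamma process). Let $\{M(t)\}$ be a MCP with $q(m,1) = \delta(d_0 - m)\,\mathbb{1}\{m < d_0\}$ and $q(m,k) = 0$ for $k > 1$, i.e. the counting process associated with a linear death process $\{\tilde{M}(t)\}$ with individual death rate $\delta \in \mathbb{R}^+$ and initial population size $d_0 \in \mathbb{N}$. Introducing continuous-time gamma noise $\{\xi(t)\} = \{d\Gamma(t)/dt\}$ where $\Gamma(t) \sim \mathrm{Gamma}(t/\tau, \tau)$ defines $\{N(t)\}$, a compound infinitesimally over-dispersed MCP with increment probabilities

$$P(\Delta N = k \mid N(t) = n) = \binom{\tilde n}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j}\big(1+\delta\tau(\tilde n - j)\big)^{-h\tau^{-1}}$$

and infinitesimal probabilities

$$q(n,k) = \binom{\tilde n}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}\,\tau^{-1}\ln\big(1+\delta\tau(\tilde n - j)\big)$$

for $n < d_0$ and $k \in \{0, \dots, \tilde n\}$, and zero otherwise. Here, $\tilde n = d_0 - n$. The infinitesimal rate is

$$\lambda(n) = \tau^{-1}\ln(1+\delta\tau\tilde n).$$

The infinitesimal moments and dispersion are

$$D_{dN}(n) = 1 + (\tilde n - 1)\,\frac{2\ln(1+\delta\tau) - \ln(1+2\delta\tau)}{\ln(1+\delta\tau)}.$$

Hence, $\{N(t)\}$ is infinitesimally over-dispersed for $\tilde n > 1$ and equi-dispersed for $\tilde n = 1$.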
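
The statements of Proposition 9 can be sanity-checked numerically. The sketch below assumes the alternating-sum form of the infinitesimal probabilities, written in terms of the number still alive, $d_0 - n$, and assumes that the dispersion formula multiplies the log ratio by the number alive minus one. Both readings are assumptions made for this illustration, and the names `binomial_gamma_q` and `binomial_gamma_summary` are hypothetical.

```python
import math

def binomial_gamma_q(alive, k, delta, tau):
    """Assumed rate of k simultaneous deaths among `alive` individuals."""
    total = 0.0
    for j in range(k + 1):
        sign = -1.0 if (k - j + 1) % 2 else 1.0          # (-1)^(k - j + 1)
        total += sign * math.comb(k, j) * math.log1p(delta * tau * (alive - j))
    return math.comb(alive, k) * total / tau

def binomial_gamma_summary(alive, delta, tau):
    """Rate, infinitesimal mean, numerical dispersion and assumed closed form."""
    ks = range(1, alive + 1)
    q = [binomial_gamma_q(alive, k, delta, tau) for k in ks]
    rate = sum(q)
    mean = sum(k * qk for k, qk in zip(ks, q))
    second = sum(k * k * qk for k, qk in zip(ks, q))
    log1 = math.log1p(delta * tau)
    log2 = math.log1p(2.0 * delta * tau)
    closed_form = 1.0 + (alive - 1) * (2.0 * log1 - log2) / log1
    return rate, mean, second / mean, closed_form

rate, mean, dispersion, closed_form = binomial_gamma_summary(alive=5, delta=0.3, tau=0.8)
```

With these inputs the summed rate matches $\tau^{-1}\ln(1+\delta\tau\cdot 5)$ and the numerical dispersion matches the closed form; with a single individual alive only unit jumps are possible and the dispersion reduces to one.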



3.1.3. The Negative Binomial Gamma Process 

Unlike for the death process, when introducing gamma noise to the birth process we are only able to show existence 
of moments imposing a restriction on the parameter space. In particular, the birth rate of the original process imposes 
an upper bound on the over-dispersion. When this restriction does not hold, the moments of the resulting process do 
not exist, and hence our dispersion index is not defined. We include the derivations with gamma noise for consistency 
with the Poisson and death process. Considering a common subordinator for all three processes has the advantage that 
it leads naturally to the multivariate situations in Section 4, in which over-dispersed univariate processes are combined 
to construct multivariate models. It would be possible to use other subordinators, such as the inverse Gaussian process, 
for which the moment generating function is available in closed form. 

Proposition 10 (negative binomial gamma process). Let $\{M(t)\}$ be a MCP with $q(m,1) = \beta m\,\mathbb{1}\{m > 0\}$ and $q(m,k) = 0$ for $k > 1$, i.e. a linear birth process for $\beta \in \mathbb{R}^+$. Introducing continuous-time gamma noise $\{\xi(t)\} = \{d\Gamma(t)/dt\}$ where $\Gamma(t) \sim \mathrm{Gamma}(t/\tau, \tau)$ defines $\{N(t)\}$, a compound infinitesimally over-dispersed MCP with increment probabilities for $k, n \in \mathbb{N}$



P(ΔN = k | N(t) = n) =

and infinitesimal probabilities

$$q(n,k) = \binom{n+k-1}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}\,\tau^{-1}\ln\big(1+\beta\tau(n+k-j)\big)$$



and zero otherwise. The infinitesimal rate is

$$\lambda(n) = \tau^{-1}\ln(1+\beta\tau n).$$

For $2\beta\tau < 1$, the infinitesimal moments and dispersion are



IJ-dN 



dN 



fidN + nt 



-1 




$$D_{dN}(n) = 1 + (n-1)\,\frac{2\ln(1-\beta\tau) - \ln(1-2\beta\tau)}{-\ln(1-\beta\tau)}.$$

Hence, $\{N(t)\}$ is infinitesimally over-dispersed for $n > 1$ and equi-dispersed for $n = 1$.

3.1.4. The binomial beta process 

The infinitesimal moments of the binomial gamma process are a non-linear system of two equations which, to obtain a desired mean and variance, needs to be solved numerically for $\delta$ and $\tau$, the parameters with which the process is actually parameterized. A moment-based parameterization makes it easy to change the variability (via the variance) for a fixed location (fixing the mean), so that parameter changes are easy to interpret. Other parameterizations may require changes in several parameters to achieve the same goal. In the context of counting processes, such a parameterization has the additional advantage that it permits a direct and straightforward comparison with analogous stochastic differential equations.

As an alternative, the binomial beta process, as defined below, can be easily parameterized in terms of the infinitesimal moments. Instead of introducing continuous-time noise to the rates, we consider introducing it directly to the event probabilities of the death binomial process. Since probabilities are constrained to the unit interval, we need to consider an alternative to gamma noise. An obvious alternative would be beta noise. The construction of a beta process as a process with beta independent increments is not, however, straightforward [17]. We therefore consider in this section an alternative construction of compound processes, in which noise is introduced by taking limits of discrete-time processes [5]. For this construction, let $\{\Pi_0, \Pi_1, \Pi_2, \dots\}$ be an infinite collection of independent and identically distributed random variables and, for each fixed $h > 0$, define the continuous-time process $\{\Pi_h(t)\}$ by $\Pi_h(t) = \Pi_i$ for $t \in [ih, (i+1)h)$. For each $h > 0$, we can conditionally define a death process with time-varying rate $\{\Pi_h(t)\}$. Integrating out over the distribution of $\{\Pi_h(t)\}$ results in an (unconditional) process for each $h > 0$, and the resulting infinitesimal probabilities in the limit as $h$ tends to zero define a Markov process. As proved in Proposition 11 below, when $\Pi_i \sim \mathrm{Beta}(\alpha, \beta)$, this construction defines the infinitesimal probabilities of a new process, which we call the binomial beta process. Here $\mathrm{Beta}(\alpha, \beta)$ is a beta distribution with mean $\alpha/(\alpha+\beta)$ and variance $\alpha\beta/[(\alpha+\beta)^2(\alpha+\beta+1)]$.

In spite of the more convenient parameterization, the integrated increment probabilities of the binomial beta process are not obtained as a byproduct as in Sections 3.1.1-3.1.3. Hence, exact simulation of the counts is only possible (with the present results) by aggregation from exact event time simulation based on the provided, defining infinitesimal probabilities. As in Section 3.1.2, let $\tilde{M}(t) = d_0 - M(t)$.

Proposition 11 (binomial beta process). Let $\{M(t)\}$ be a MCP with $q(m,1) = \delta(d_0 - m)\,\mathbb{1}\{m < d_0\}$ and $q(m,k) = 0$ for $k > 1$, i.e. the counting process associated with a linear death process $\{\tilde{M}(t)\}$ with individual death rate $\delta \in \mathbb{R}^+$ and initial population size $d_0 \in \mathbb{N}$. Let $c = (\tilde n - 1)/\omega - 1$ for $\tilde n > 1$ and $c \in \mathbb{R}^+$ otherwise, with $0 < \omega < \tilde n - 1$. Introducing continuous-time discretized beta noise $\{\xi_h(t)\}$ in the death probabilities over an interval $[t, t+h]$, where $\Pi_h(t) \sim \mathrm{Beta}\big(c(1-e^{-\delta h}),\, c\,e^{-\delta h}\big)$, and taking the limit $h \to 0$, defines $\{N(t)\}$, a compound infinitesimally over-dispersed MCP with infinitesimal probabilities for $n < d_0$ and $k \in \{0, \dots, \tilde n\}$

$$q(n,k) = c\,\delta\binom{\tilde n}{k}\frac{G(k)\,G(c+\tilde n-k)}{G(c+\tilde n)}$$

and zero otherwise. Here, $\tilde n = d_0 - n$. The infinitesimal moments are $\mu_{dN} = \tilde n\delta$ and, for $\tilde n > 1$, $\sigma^2_{dN} = (1+\omega)\,\mu_{dN}$ with dispersion $D_{dN}(n) = 1 + \omega$ so that $\{N(t)\}$ is infinitesimally over-dispersed. If $\tilde n = 1$, $\mu_{dN} = \sigma^2_{dN} = \tilde n\delta$ and $\{N(t)\}$ is equi-dispersed.
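
The moment-based parameterization of the binomial beta process can be illustrated numerically. The sketch below assumes that, with `alive` $= d_0 - n$ individuals remaining, the constant is $c = (\text{alive} - 1)/\omega - 1$ and the infinitesimal probabilities are $c\,\delta\binom{\text{alive}}{k} G(k)G(c+\text{alive}-k)/G(c+\text{alive})$ for jumps $k \geq 1$. These readings, and the function names, are assumptions made for illustration.

```python
import math

def binomial_beta_q(alive, k, delta, c):
    """Assumed rate of k >= 1 simultaneous deaths among `alive` individuals."""
    # Work on the log scale to keep the gamma-function ratio stable.
    log_ratio = math.lgamma(k) + math.lgamma(c + alive - k) - math.lgamma(c + alive)
    return c * delta * math.comb(alive, k) * math.exp(log_ratio)

def binomial_beta_summary(alive, delta, omega):
    """Infinitesimal mean and dispersion for a target over-dispersion omega."""
    if not (alive > 1 and 0.0 < omega < alive - 1):
        raise ValueError("need alive > 1 and 0 < omega < alive - 1")
    c = (alive - 1) / omega - 1.0
    ks = range(1, alive + 1)
    q = [binomial_beta_q(alive, k, delta, c) for k in ks]
    mean = sum(k * qk for k, qk in zip(ks, q))
    second = sum(k * k * qk for k, qk in zip(ks, q))
    return mean, second / mean

mean, dispersion = binomial_beta_summary(alive=6, delta=0.2, omega=2.0)
```

Under these assumptions the mean equals the number alive times $\delta$ and the dispersion equals $1 + \omega$, so the mean and the variance can be set separately through $\delta$ and $\omega$.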

4. Markov Counting Systems and Multivariate Dispersion 

Although univariate counting processes have applications in their own right, more complex scenarios will in 
general require multivariate processes. We begin this section by defining a multivariate extension of MCPs and gener- 
alizing the dispersion indices of Section 2 to such multivariate processes, after introducing some necessary additional 
notation. The study of integrated moments of multivariate processes has given rise to a substantial literature, often 
considering truncation of the moment generating function [25]. Even though these integrated moments have been 
analyzed, we are not aware of any attempt to formally treat dispersion of a multivariate process. In this section, we 
do this from both an integrated and infinitesimal perspective, focusing on the latter. Then, we establish sufficiency 
and necessity of simpleness and compoundness for equi- and over-dispersion respectively for multivariate processes, 
generalizing the results of Theorem 1 and Corollary 2. We illustrate this generalization by establishing equi-dispersion
of multivariate birth-death processes. This result will lead us in Section 5 to show that the univariate over-dispersed 
processes of Section 3, together with other standard univariate processes, can be used as building blocks to construct 
multivariate processes which retain the dispersion of the univariate blocks. We will illustrate this point by constructing 
an over-dispersed multivariate birth-death process. The integrated properties of the resulting new multivariate process 
will, however, generally be different from those of the univariate blocks. This is further evidence for the conceptual 
simplicity arising from a focus on infinitesimal dispersion. Henceforth, as earlier, we will mean infinitesimal dispersion when we simply talk about dispersion. In this "building block" approach, it is straightforward to consider
simultaneous events in any given block (e.g., blocks of the form of the over-dispersed processes of Section 3). How- 
ever, this univariate-flavored approach is not amenable to simultaneous events among different blocks. In the context 
of the processes of Section 3, such an approach would require considering dependencies among different white noises, 
which is outside the scope of this article. 

In order to define a multivariate extension of MCPs, consider a finite family of counting processes $\{\{N_{ij}(t)\} : t \in \mathbb{R}^+, i \in \mathcal{I}, j \in \mathcal{J}\}$ with starting conditions $N_{ij}(0) = 0$. We will refer to the multivariate, vector-valued process formed by such a family of processes as $\{N(t)\}$, and we will use bold letters for multivariate processes. The family member $\{N_{ij}(t)\}$ counts events of the ij-type. Each event type can be interpreted as a transition from one state (which we call the initial condition) to another (called the final condition). The set $\mathcal{I}$ (and $\mathcal{J}$) consists of all possible initial (and final) conditions, and the set of all possible conditions we call $\mathcal{C} = \mathcal{I} \cup \mathcal{J}$. As an illustration, consider the simple, linear birth process, where sets $\mathcal{I}$ and $\mathcal{J}$ could be defined as $\mathcal{I}$ = {"unborn"} and $\mathcal{J}$ = {"alive"} and $N_{\text{"alive" "unborn"}}(t) = 0$. To ease notation in the subindices, we will assume in what follows that the elements of $\mathcal{I}$ and $\mathcal{J}$ are relabeled so that $\mathcal{C} = \{1, \dots, C\}$ with $C$ being the cardinality of $\mathcal{C}$. Let a Markov counting system (MCS), with associated multivariate counting process $\{N(t)\}$, be the integer-valued Markov process $\{X(t)\} = \{(X_1(t), \dots, X_C(t))\}$ for $X(0) \in \mathbb{Z}^C$ defined by "conservation of mass" identities

$$X_c(t) = X_c(0) + \sum_{i \neq c} N_{ic}(t) - \sum_{j \neq c} N_{cj}(t), \qquad (7)$$

for $c \in 1, \dots, C$ so that $X(t) \in \mathbb{Z}^C$. The infinitesimal probabilities of the MCS $\{X(t)\}$ are defined, via (7), by the infinitesimal probabilities of $\{N(t)\}$ which are in turn assumed to be a function of $X(t)$. Specifically, we define the non-zero infinitesimal probabilities to be

$$q_{ij}(x,k) \equiv \lim_{h \downarrow 0} \frac{P\big(\Delta N_{ij} = k \text{ and } \Delta N_{lm} = 0 \text{ for } (l,m) \neq (i,j) \mid X(t) = x\big)}{h} \qquad (8)$$

for $i \neq j$, $k \geq 1$ and $x \in \mathbb{Z}^C$. Analogously to Section 2, define the rate function of this MCS to be

$$\lambda(x) \equiv \lim_{h \downarrow 0} \frac{1 - P\big(\Delta N_{ij} = 0 \text{ for all } (i,j) \mid X(t) = x\big)}{h}.$$

Again we restrict ourselves to stable and conservative processes so that $\lambda(x) = \sum_{ijk} q_{ij}(x,k) < \infty$ for all $x$, with the sum being over $i, j \in \mathcal{C}$, $i \neq j$, $k \geq 1$. As motivated above, we have restricted the processes under consideration to
those where simultaneous events are possible but only for a given process at any event time. This is a natural choice 
since our ultimate goal is to construct multivariate over-dispersed processes using univariate ones as building blocks. 

$\{X(t)\}$ can be interpreted as a compartment model [19] with $C$ compartments and where $N_{ij}(t)$ counts the direct flow from compartment $i$ to $j$. Equivalently, it can also be interpreted as a queuing network. The conservation equations in (7) require that the sum of the $C$ processes $\{X_c(t)\}$ remains unchanged at any given moment in time. However, they do not require that the $\{X_c(t)\}$ processes be bounded, as would occur in a standard compartment model with fixed total population size in which compartment counts are necessarily non-negative. This is achieved by allowing at least some $\{X_c(t)\}$ to be negative. As examples, we consider the possibility of birth, immigration and death events. To model births or immigration or both into compartment $r$, we can require by construction that there is some $s$ for which $X_s(t) = -N_{sr}(t)$. Then, $X_s(t)$ keeps track of the negative count of births or immigration which enter $r$ (and so lead to an increase in $X_r(t)$). We call $X_s(t)$ a source process. One can similarly define a sink process to count death events.
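
The conservation-of-mass identities in (7) and the role of a source compartment can be sketched as follows. The helper `mcs_state`, the zero-based compartment labels and the example counts are illustrative assumptions.

```python
def mcs_state(x0, counts):
    """Apply the conservation-of-mass identities (7).

    x0     : list with the initial value X_c(0) of each compartment
    counts : dict mapping (i, j), i != j, to the count N_ij(t) of i -> j events
    """
    x = list(x0)
    for (i, j), n in counts.items():
        if i == j:
            raise ValueError("transitions must join two different compartments")
        x[i] -= n   # every ij-event leaves compartment i ...
        x[j] += n   # ... and enters compartment j
    return x

# Birth-death example: 0 = "unborn" (source), 1 = "alive", 2 = "dead" (sink).
x0 = [0, 10, 0]
counts = {(0, 1): 4, (1, 2): 7}   # 4 births and 7 deaths by time t
x = mcs_state(x0, counts)
```

Here the source compartment ends at minus the number of births, the sink at the number of deaths, and the total across compartments is the same as at time zero.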

We slightly generalize the definition of dispersion indices of univariate counting processes as follows. We will refer to

$$D^{ij}_{dX}(x) \equiv \frac{\lim_{h \downarrow 0} h^{-1}\,V[N_{ij}(t+h) - N_{ij}(t) \mid X(t) = x]}{\lim_{h \downarrow 0} h^{-1}\,E[N_{ij}(t+h) - N_{ij}(t) \mid X(t) = x]} \equiv \frac{\sigma^{2\,ij}_{dX}(x)}{\mu^{ij}_{dX}(x)} \qquad (9)$$

as the infinitesimal dispersion index of $\{N_{ij}(t)\}$, the ij-marginal counting process associated to the MCS $\{X(t)\}$. Since the infinitesimal moments in (9) are a function of $x$ and are directly related to the increments of $\{X(t)\}$, we refer to the collection $\{D^{ij}_{dX}(x) : i \in \mathcal{I}, j \in \mathcal{J}\}$ as the dispersion of the multivariate MCS $\{X(t)\}$. Following our approach of studying simultaneous events only on single transitions (equivalent here to noise processes $\{\xi_{ij}(t)\}$ which are independent for $i \neq j$), we do not need to consider infinitesimal covariance here. We correspondingly define the integrated counterpart of (9) as

$$D^{ij}_{X} \equiv \frac{V[N_{ij}(t) - N_{ij}(0) \mid X(0) = x_0]}{E[N_{ij}(t) - N_{ij}(0) \mid X(0) = x_0]}.$$

We will now say that an ij-marginal counting process $\{N_{ij}(t)\}$ is (integrally) over-dispersed if $D^{ij}_{dX} > 1$ ($D^{ij}_{X} > 1$) for $i \in \mathcal{I}$, $j \in \mathcal{J}$. We will also say that a MCS $\{X(t)\}$ is (integrally) over-dispersed if $D^{ij}_{dX} > 1$ ($D^{ij}_{X} > 1$) for at least some $i, j \in \mathcal{C}$. As in Section 2, we define under- and equi-dispersion accordingly.

We now prove Theorem 12 below, which gives the infinitesimal mean and variance of the ij-marginal counting processes associated with a MCS. Similarly to (3) in the univariate section, these two moments are the denominator and numerator of (9) respectively. Note that these moments are now conditional on $X(t) = x$ because we have allowed in the definition of a MCS in (8) for possible dependencies between different ij-marginal processes. We use Theorem 12 to prove the corollaries which complete this section regarding both sufficient and necessary conditions for dispersion of MCSs. We will also use Theorem 12 in Section 5 to prove that MCSs constructed using univariate blocks retain the dispersion properties of the blocks. The moment existence conditions we use now concern the total number of events that $\{N(t)\}$, associated to $\{X(t)\}$ through (7), makes in an interval $[t, t+h]$ and the size of simultaneous events in this same interval for each ij-marginal. Analogously to Section 2, define now

$$\Lambda_{ij}(t) = \sup_{t \leq s \leq t+h} \lambda(X(s)).$$

Note that we use the same bound for all ij-marginal processes. Since we are allowing the possibility of compound processes, we also need to define a stochastic bound conditional on $X(t) = x$ for the size of simultaneous events

$$Z_{ij}(t) = \sup_{t \leq s \leq t+h} dN_{ij}(s).$$

Again, we suppress the dependence of $\Lambda_{ij}(t)$ and of $Z_{ij}(t)$ on $x$ and $h$. Now consider:

P3. For each $t$, $x$ and $i \neq j$ there is some $h > 0$ such that $E[Z_{ij}(t)\Lambda_{ij}(t)] < \infty$.

P4. For each $t$, $x$ and $i \neq j$ there is some $h > 0$ such that $V[Z_{ij}(t)\Lambda_{ij}(t)] < \infty$.

Properties P3 and P4 again require that the ij-marginal counting processes do not have explosive behavior, and in particular they hold for simple birth-death processes with linear birth and death rates, as shown in Corollary 14.

Theorem 12 (infinitesimal moments of a MCS). Let $\{X(t)\}$ be a time homogeneous, stable and conservative Markov counting system with associated multivariate counting process $\{N(t)\}$ defined by (7) and (8). Supposing (P3), the infinitesimal mean of $\{N_{ij}(t)\}$ is the first moment of its jump distribution, i.e. $\mu^{ij}_{dX}(x) = \sum_k k\,q_{ij}(x,k)$. Supposing (P4), its infinitesimal variance is the second moment of its jump distribution, i.e. $\sigma^{2\,ij}_{dX}(x) = \sum_k k^2 q_{ij}(x,k)$.

Proof. Let $\tilde{N}_{ij}(t)$ be a conditional compound Poisson process with event rate $\Lambda(t) = \Lambda_{ij}(t)$ and degenerate jump distribution with mass one at $Z(t) = Z_{ij}(t)$. Work conditionally on $X(t) = x$ whenever the (potentially already conditional) expectation is taken over $\Delta N_{ij}(t)$ or functions of it. Let $S$ be the event of exactly one single jump (of size one or more) in any of the $\{N_{ij}(t) : i \neq j\}$ counting processes occurring in $[t, t+h]$, as in Lemma 19. Then,

$$E[\Delta N_{ij}(t)] = E[\Delta N_{ij}(t)\,\mathbb{1}\{S\}] + E[\Delta N_{ij}(t)\,\mathbb{1}\{S^c\}]. \qquad (10)$$

Unlike in Theorem 1, the term corresponding to one single jump is not immediate and requires Lemma 19. Letting $S_{ij}$ be the event of exactly one single jump (of size one or more) in the ij counting process occurring in $[t, t+h]$,

$$E[\Delta N_{ij}(t)\,\mathbb{1}\{S\}] = E[\Delta N_{ij} \mid S_{ij}]\,P(S_{ij} \mid S)\,P(S)$$

$$= \sum_k k\,\frac{q_{ij}(x,k)}{\sum_k q_{ij}(x,k)} \cdot \frac{\sum_k q_{ij}(x,k)}{\sum_{i,j,k} q_{ij}(x,k)} \cdot \Big(h \sum_{i,j,k} q_{ij}(x,k) + o(h)\Big)$$

$$= h \sum_k k\,q_{ij}(x,k) + o(h).$$

Analogously to Theorem 1, we proceed to bound the second term to show the desired result. Since $\Delta N_{ij}(t)$ is stochastically smaller than $\Delta\tilde{N}_{ij}(t)$,

$$E[\Delta N_{ij}(t)\,\mathbb{1}\{S^c\}] \leq E[\Delta\tilde{N}_{ij}(t)\,\mathbb{1}\{S^c\}] = E\big[E[\Delta\tilde{N}_{ij}(t)\,\mathbb{1}\{S^c\} \mid \Lambda(t), Z(t)]\big].$$

Using (10) with $N_{ij}(t)$ replaced by $\tilde{N}_{ij}(t)$ and since $E[\Delta\tilde{N}_{ij}(t) \mid \Lambda(t), Z(t)] = Z(t)h\Lambda(t)$ and $E[\Delta\tilde{N}_{ij}(t)\,\mathbb{1}\{S\} \mid \Lambda(t), Z(t)] = Z(t)h\Lambda(t)\exp\{-h\Lambda(t)\}$, it follows that

$$E[\Delta N_{ij}(t)\,\mathbb{1}\{S^c\}] \leq E[Z(t)h\Lambda(t) - Z(t)h\Lambda(t)\exp\{-h\Lambda(t)\}] = E[Z(t)h\Lambda(t)(1 - \exp\{-h\Lambda(t)\})].$$

As in Theorem 1, it follows by dominated convergence, since $z\lambda(1 - \exp\{-h\lambda\}) < z\lambda$ and $E[Z(t)\Lambda(t)]$ is finite, that

$$\lim_{h \downarrow 0} \frac{E[Z(t)h\Lambda(t)(1 - \exp\{-h\Lambda(t)\})]}{h} = E\big[\lim_{h \downarrow 0} Z(t)\Lambda(t)(1 - \exp\{-h\Lambda(t)\})\big] = 0.$$

Therefore, $E[\Delta N_{ij}(t)\,\mathbb{1}\{S^c\}] = o(h)$ and the result for the mean follows. Replacing first by second moments, the result for the variance follows since

$$E[(\Delta N_{ij}(t))^2\,\mathbb{1}\{S^c\}] \leq E[(\Delta\tilde{N}_{ij}(t))^2\,\mathbb{1}\{S^c\}]$$

$$= E\big[E[(\Delta\tilde{N}_{ij}(t))^2\,\mathbb{1}\{S^c\} \mid \Lambda(t), Z(t)]\big]$$

$$= E[Z^2(t)h\Lambda(t) + Z^2(t)h^2\Lambda^2(t) - Z^2(t)h\Lambda(t)\exp\{-h\Lambda(t)\}]$$

$$\leq E[2Z^2(t)h^2\Lambda^2(t)] = o(h),$$

since $E[Z^2(t)\Lambda^2(t)]$ is assumed to be finite. □

Theorem 12 makes it straightforward to characterize dispersion of MCSs according to whether the ij-marginal processes associated to $\{X(t)\}$ are simple or compound.

Corollary 13 (sufficient and necessary conditions for Markov infinitesimal dispersion). Let $\{X(t)\}$ be a time homogeneous, stable and conservative Markov counting system with associated multivariate counting process $\{N(t)\}$. Supposing (P3) and (P4), it is sufficient and necessary that an ij-marginal process $\{N_{ij}(t)\}$ associated with $\{X(t)\}$ be compound (simple) for it to be infinitesimally over-(equi-)dispersed.

Proof. Let us first establish sufficiency. Using Theorem 12, for those ij-marginal processes $\{N_{ij}(t)\}$ which are simple,

$$D^{ij}_{dX}(x) = \frac{\sum_k k^2 q_{ij}(x,k)}{\sum_k k\,q_{ij}(x,k)} = \frac{q_{ij}(x,1)}{q_{ij}(x,1)} = 1.$$

For those ij-marginal processes $\{N_{ij}(t)\}$ which are compound,

$$D^{ij}_{dX}(x) = \frac{\sum_k k^2 q_{ij}(x,k)}{\sum_k k\,q_{ij}(x,k)} > 1$$

since $k^2 > k$ for $k > 1$. Hence, an infinitesimally equi-dispersed process must be simple because compoundness suffices to establish over-dispersion. Analogously, an infinitesimally over-dispersed process must be compound, establishing necessity. □
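
The ratio used in this proof is easy to evaluate for any finite set of jump rates. The sketch below computes the infinitesimal dispersion of one marginal at a fixed state from a table of rates $q_{ij}(x,k)$; the function name and the example rates are illustrative assumptions.

```python
def infinitesimal_dispersion(q):
    """Infinitesimal dispersion sum_k k^2 q(k) / sum_k k q(k).

    q : dict mapping a jump size k >= 1 to its rate q_ij(x, k) at a fixed state x
    """
    mean = sum(k * rate for k, rate in q.items())
    variance = sum(k * k * rate for k, rate in q.items())
    if mean <= 0:
        raise ValueError("the marginal has no events at this state")
    return variance / mean

simple = {1: 3.0}              # only unit jumps
compound = {1: 2.0, 2: 0.5}    # simultaneous pairs of events are possible

d_simple = infinitesimal_dispersion(simple)
d_compound = infinitesimal_dispersion(compound)
```

A marginal with only unit jumps gives a ratio of exactly one, while any positive rate on a jump of size two or more pushes the ratio above one.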

As an illustration of this result, consider the following MCS. Define the sets $\mathcal{I}^{BD}$ = {"unborn", "alive"} and $\mathcal{J}^{BD}$ = {"alive", "dead"} so that $\mathcal{C}^{BD}$ = {"unborn", "alive", "dead"} which we relabel as $\mathcal{C}^{BD} = \{1, 2, 3\}$. Then, let $\{X^{BD}(t)\}$ with starting conditions $(X^{BD}_1(0), X^{BD}_2(0), X^{BD}_3(0)) \in \mathbb{Z} \times \mathbb{N} \times \mathbb{N}$ be the MCS with associated multivariate counting process $\{N^{BD}(t)\} = \{N^{BD}_{12}(t), N^{BD}_{13}(t), N^{BD}_{21}(t), N^{BD}_{23}(t), N^{BD}_{31}(t), N^{BD}_{32}(t)\}$ with $N^{BD}_{ij}(t) = 0$ for $ij \in \{21, 13, 31, 32\}$ and defined by infinitesimal probabilities for $k \geq 1$

$$q^{BD}_{12}(x,k) = \beta x_2\,\mathbb{1}\{x_2 > 0, k = 1\}$$
$$q^{BD}_{23}(x,k) = \delta x_2\,\mathbb{1}\{x_2 > 0, k = 1\}.$$

$\{X_2(t)\}$ is commonly referred to as the simple, linear birth-death process with death rate $\delta$ and birth rate $\beta$. This MCS representation of the process departs from its usual univariate representation by extending the state to include $X_1(t) = -N_{12}(t)$, which keeps track of the (negative) number of total births, and $X_3(t) = N_{23}(t)$, which keeps track of the number of total deaths.

Corollary 14 (infinitesimal equi-dispersion of birth-death processes). A simple, linear birth-death process with death rate $\delta \in \mathbb{R}^+$ and birth rate $\beta \in \mathbb{R}^+$ is infinitesimally equi-dispersed.

Proof. Use Corollary 13. Since the maximum jump of both non-zero ij-marginal processes $\{N^{BD}_{12}(t)\}$ and $\{N^{BD}_{23}(t)\}$ is one, we only need to check existence of $E[\Lambda^{BD}(t) \mid X^{BD}(t) = x]$ and of $V[\Lambda^{BD}(t) \mid X^{BD}(t) = x]$. This is granted since

$$\lambda^{BD}(X^{BD}(t)) = (\beta + \delta)\,X^{BD}_2(t)\,\mathbb{1}\{X^{BD}_2(t) > 0\}$$

is stochastically bounded by $(\beta + \delta)B(t)$, where $\{B(t)\}$ is a simple, linear birth process with birth rate $\beta$. Then

$$E[\Lambda^{BD}(t) \mid X^{BD}(t) = x] \leq E\Big[\sup_{t \leq s \leq t+h} (\beta + \delta)B(s) \,\Big|\, B(t) = x_2\Big] = E[(\beta + \delta)B(t+h) \mid B(t) = x_2] < \infty.$$

By the same argument $V[\Lambda^{BD}(t) \mid X^{BD}(t) = x] < \infty$ and the result follows for the non-degenerate case of $x_2 > 0$. □

5. Over-dispersed Markov Counting Systems

After establishing equi-dispersion of simple multivariate processes in Corollary 13, we proceed to construct over-dispersed multivariate processes. As we have advanced, we will pursue this goal using a "building block" strategy. Moving from a collection of univariate blocks to a multivariate MCS is not immediate. In our implementation of the "building block" strategy, we proceed in two steps. First, we illustrate how a univariate MCP may be equivalently written as a bivariate MCS by endowing it with sets $\mathcal{I}$ and $\mathcal{J}$ and adding an additional state. We do this to facilitate notation in the rest of the paper. Second, we show in Lemma 16 that "stacking" the infinitesimal probabilities of each block (now written as the corresponding bivariate MCS) results in a MCS that retains dispersion of the blocks.

Consider writing the MCP $\{B(t)\}$, the simple, linear birth process of Corollary 3, with starting value $B(0) \in \mathbb{N}$, as bivariate MCS $\{X^B(t)\}$. This may be achieved by defining the sets $\mathcal{I}^B$ = {"unborn"} and $\mathcal{J}^B$ = {"alive"}, so that $\mathcal{C}^B$ = {"unborn", "alive"} which we relabel as $\mathcal{C}^B = \{1, 2\}$. Then, let $\{X^B(t)\}$ be the bivariate MCS with associated multivariate counting process $\{N^B(t)\} = \{N^B_{12}(t), N^B_{21}(t)\}$ and with $X^B_1(0) \in \mathbb{Z}$ and $X^B_2(0) = B(0)$,

$$q^B_{12}(x,k) = \beta x_2\,\mathbb{1}\{x_2 > 0, k = 1\}$$
$$q^B_{21}(x,k) = 0.$$

Note that, by the definition of the infinitesimal probabilities, it follows that $N^B_{21}(t) = 0$ and that $B(t) - B(0) \sim N^B_{12}(t)$. Had $\{B(t)\}$ been the negative binomial gamma process of Section 3.1.3, the corresponding MCS could have been defined letting

$$q^B_{12}(x,k) = \binom{x_2+k-1}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}(\tau^B)^{-1}\ln\big(1+\beta\tau^B(x_2+k-j)\big)$$
$$q^B_{21}(x,k) = 0.$$

Consider now writing the MCP $\{D(t)\}$, the counting process associated with a simple, linear death process $\{\tilde{D}(t)\}$ of Corollary 4, with starting value $\tilde{D}(0) \in \mathbb{N}$, as the bivariate MCS $\{X^D(t)\}$. This may be achieved by defining now sets $\mathcal{I}^D$ = {"alive"} and $\mathcal{J}^D$ = {"dead"}, so that $\mathcal{C}^D$ = {"alive", "dead"} which we relabel as $\mathcal{C}^D = \{1, 2\}$. Then, let $\{X^D(t)\}$, with $X^D_1(0) = \tilde{D}(0)$ and $X^D_2(0) \in \mathbb{N}$, be the bivariate MCS with associated multivariate counting process $\{N^D(t)\} = \{N^D_{12}(t), N^D_{21}(t)\}$ defined by infinitesimal probabilities for $k \geq 1$

$$q^D_{12}(x,k) = \delta x_1\,\mathbb{1}\{x_1 > 0, k = 1\}$$
$$q^D_{21}(x,k) = 0.$$

Again $N^D_{21}(t) = 0$ and $D(t) \sim N^D_{12}(t)$ and the following alternative infinitesimal probabilities would define the bivariate MCS corresponding to the binomial gamma process of Section 3.1.2

$$q^D_{12}(x,k) = \binom{x_1}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}(\tau^D)^{-1}\ln\big(1+\delta\tau^D(x_1-j)\big)$$
$$q^D_{21}(x,k) = 0.$$

Now that we have a potential collection of blocks (bivariate MCSs corresponding to univariate MCPs), we move 
to the second step of our strategy and define in Definition 15 a MCS associated to a collection of blocks by "stacking" 
the infinitesimal probabilities of these blocks and show in Lemma 16 that the block dispersion remains unchanged. 
Note that Definition 15 complements Definition 8 in which MCSs were defined. From Definition 8, it is possible to decompose a MCS into a collection of independent univariate MCPs of the ij-type by considering only sets of $k$ infinitesimal probabilities $q_{ij}(x,k)$ and their corresponding initial conditions. On the other hand, Definition 15 gives one possible construction of a MCS given a collection of independent univariate MCPs. We illustrate these results by constructing an over-dispersed birth-death process in Proposition 17. The general class of MCSs specified in Definition 15 also includes, for example, the over-dispersed compartment model of [5].

Definition 15 (MCS associated to a collection of blocks). Let $\{\{X^b(t)\} : b \in S\}$ be a collection of independent time homogeneous, conservative and stable bivariate MCSs defined on sets $\mathcal{I}^b = \{i^b\}$ and $\mathcal{J}^b = \{j^b\}$ by infinitesimal probabilities $q^b(x,k)$. We define the MCS $\{W(t)\}$ associated with this collection of blocks to be the MCS defined on sets $\mathcal{I} = \bigcup_{b \in S} \mathcal{I}^b$ and $\mathcal{J} = \bigcup_{b \in S} \mathcal{J}^b$ via infinitesimal probabilities $q_{ij}(w,k) = q^{B(i,j)}((w_i, w_j), k)$ where $B : \mathcal{I} \times \mathcal{J} \to S$ is defined so that $B(i,j) = b$ implies $\mathcal{I}^b = i$ and $\mathcal{J}^b = j$.

For Definition 15 to uniquely specify a MCS, we need to impose two conditions. First, since $B(i,j)$ is the element in the block set $S$ corresponding to transitions from $i$ to $j$, we require that only one block in the collection counts events of the ij-type, so that $B(i,j)$ defines only one element in the block set $S$, i.e. so that $\mathcal{I}^b = i$ and $\mathcal{J}^b = j$ together imply $B(i,j) = b$. Then, the assignment of infinitesimal probabilities in Definition 15 is unique. Second, some care is needed with initial conditions of the blocks and how they correspond to $W(0)$, the initial conditions of the MCS. We simply assume that $W(0)$ is defined separately from $\{\{X^b(0)\} : b \in S\}$, to avoid concerning ourselves with possible inconsistencies between the initial conditions of the blocks once they are juxtaposed.
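
The "stacking" in Definition 15 can be sketched as a lookup from a transition (i, j) to the block that owns it. In the sketch below a dictionary keyed by (i, j) plays the role of the map $B$, so the first uniqueness condition holds automatically because a key can appear only once. The function `stack_blocks`, the zero-based labels and the two simple example blocks are illustrative assumptions.

```python
def stack_blocks(blocks):
    """Build q_ij(w, k) for the MCS associated with a collection of blocks.

    blocks : dict mapping a transition (i, j) to a bivariate block rate
             function q_block((x_i, x_j), k)
    """
    def q(i, j, w, k):
        block = blocks.get((i, j))
        if block is None:
            return 0.0                    # no block counts ij-type events
        return block((w[i], w[j]), k)     # q^{B(i,j)}((w_i, w_j), k)
    return q

beta, delta = 0.4, 0.1

def birth_block(x, k):
    # Block on ("unborn", "alive"): simple linear birth, driven by the alive count x[1].
    return beta * x[1] if (k == 1 and x[1] > 0) else 0.0

def death_block(x, k):
    # Block on ("alive", "dead"): simple linear death, driven by the alive count x[0].
    return delta * x[0] if (k == 1 and x[0] > 0) else 0.0

# Compartments: 0 = "unborn", 1 = "alive", 2 = "dead".
q = stack_blocks({(0, 1): birth_block, (1, 2): death_block})
w = (-3, 5, 2)
birth_rate = q(0, 1, w, 1)
death_rate = q(1, 2, w, 1)
```

Each block only ever sees the pair $(w_i, w_j)$ of its own compartments, which is why the jump distribution of each marginal, and hence its dispersion, is unchanged by the stacking. Replacing either block function with a compound one would make the corresponding marginal over-dispersed without touching the other.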

Lemma 16 (infinitesimal dispersion of MCSs associated with a collection of blocks). Let $\{W(t)\}$ be the MCS associated with the collection of blocks $\{\{X^b(t)\} : b \in S\}$. Then, supposing (P3) and (P4), the ij-marginal processes of $\{W(t)\}$ have the same dispersion as the ij-marginal processes of $\{\{X^b(t)\} : b \in S\}$.

Proof. Using Theorem 12 and letting $X(t) = X^{B(i,j)}(t)$ to simplify subindices,

$$D^{ij}_{dW}(w) = \frac{\sum_k k^2 q_{ij}(w,k)}{\sum_k k\,q_{ij}(w,k)}, \qquad D^{ij}_{dX}(w_i, w_j) = \frac{\sum_k k^2 q^{B(i,j)}((w_i, w_j), k)}{\sum_k k\,q^{B(i,j)}((w_i, w_j), k)},$$

and it follows, by Definition 15, that

$$D^{ij}_{dW}(w) = D^{ij}_{dX}(w_i, w_j).$$

□

In light of the equi-dispersion shown in Corollary 14, consider applying Lemma 16 to construct an over-dispersed birth-death process based on the over-dispersed processes of Section 3.

Proposition 17 (infinitesimally over-dispersed birth-death process). Consider the collection of blocks $\{\{X^B(t)\}, \{X^D(t)\}\}$. Here, $\{X^B(t)\}$ is a bivariate MCS corresponding to the infinitesimally over-dispersed negative binomial gamma process of Section 3.1.3 and $\{X^D(t)\}$ is a bivariate MCS corresponding to the infinitesimally over-dispersed binomial gamma process of Section 3.1.2. Consider $\{W(t)\}$, the MCS associated to this collection defined, as in Definition 15, on sets $\mathcal{I}$ = {"unborn", "alive"} and $\mathcal{J}$ = {"alive", "dead"} and $\mathcal{C}$ = {"unborn", "alive", "dead"}, relabeled as $\mathcal{C} = \{1, 2, 3\}$, with initial conditions $W(0)$, via infinitesimal probabilities for $k \geq 1$

$$q_{12}(w,k) = \binom{w_2+k-1}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}(\tau^B)^{-1}\ln\big(1+\beta\tau^B(w_2+k-j)\big)$$

and, for $k \in \{0, \dots, w_2\}$ and $w_2 > 0$,

$$q_{23}(w,k) = \binom{w_2}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}(\tau^D)^{-1}\ln\big(1+\delta\tau^D(w_2-j)\big)$$

and zero otherwise. The infinitesimal dispersion of $\{W(t)\}$ is

$$D_{dW}(w) = 1 + (w_2 - 1)\begin{pmatrix} \dfrac{2\ln(1-\beta\tau^B) - \ln(1-2\beta\tau^B)}{-\ln(1-\beta\tau^B)} \\[2ex] \dfrac{2\ln(1+\delta\tau^D) - \ln(1+2\delta\tau^D)}{\ln(1+\delta\tau^D)} \end{pmatrix}$$

for $2\beta\tau^B < 1$. Hence, the ij-marginals of $\{W(t)\}$ are infinitesimally over-dispersed for $w_2 > 1$ and equi-dispersed for $w_2 = 1$.

Proof. This proposition follows analogously to Corollary 14 using Lemma 16. In this case, the rate function is

$$\lambda(W(t)) = \sum_k q_{12}(W(t), k) + \sum_k q_{23}(W(t), k)$$
$$= (\tau^B)^{-1}\ln\big(1+\beta\tau^B W_2(t)\big) + (\tau^D)^{-1}\ln\big(1+\delta\tau^D W_2(t)\big)$$
$$\leq (\beta + \delta)\,W_2(t).$$

$W_2(t)$ is now stochastically bounded by $B(t)$, where $\{B(t)\}$ is a negative binomial gamma process playing the same role as the simple, linear birth process in the proof of Corollary 14. Hence, $\Lambda_{ij}(t)$ is conditionally stochastically bounded by $(\beta + \delta)B(t+h)$. Another difference with Corollary 14 is that both $\{N_{12}(t)\}$ and $\{N_{23}(t)\}$ may jump by more than one unit, so we also need to stochastically bound $Z_{ij}$, the maximum jump in $[t, t+h]$ of the marginals. Note that the maximum jump of the birth marginal $\{N_{12}(t)\}$ may be bounded by $B(t+h)$, i.e. any jump can be at most all of those born in that time interval. The maximum jump of the death marginal $\{N_{23}(t)\}$ may be bounded by $\{W_2(t) + B(t)\}$, i.e. any jump can be at most all those alive at time $t$ plus all those born in that time interval. Then

$$E[Z_{12}(t)\Lambda_{12}(t) \mid W(t) = w] \leq E[(\beta + \delta)B^2(t+h) \mid B(t) = w_2] < \infty,$$
$$E[Z_{23}(t)\Lambda_{23}(t) \mid W(t) = w] \leq E[(\beta + \delta)[B(t+h)w_2 + B^2(t+h)] \mid B(t) = w_2] < \infty.$$

Hence, by finiteness of the fourth moment of the negative binomial gamma process, (P3) and (P4) hold and the result follows. □

Note that replacing $q_{12}(w,k)$ or $q_{23}(w,k)$ by the corresponding equi-dispersed infinitesimal probabilities in Proposition 17 is possible and would yield infinitesimal over-dispersion in only one of the ij-marginal processes.

6. Discussion 

We have shown in Section 2 that simultaneous events are required in order to obtain infinitesimal over-dispersion 
in MCPs. We now discuss heuristic interpretations and applications of this result. There are two distinct motivations 
for modeling simultaneous events: the process in question may indeed have such occurrences, or the process may 
have clusters of event times that are short compared to the scale of primary interest. In many applications, only 
aggregated counts and not event times may be available, in which case any clustering time scale which is shorter 
than the aggregation timescale may be appropriately modeled by simultaneous events. Either way, the conclusion 
remains that if one wishes to use models based on Markov counting processes which match infinitesimal dispersion 
characteristics of a system then the possibility of simultaneous events in these models is unavoidable. 

In the past, the modeling hypothesis that events occur non-simultaneously seems to have been favored. As already proved, this is at odds with the desire for a better fit to data by allowing Markov processes to generate additional variability. To help reconcile intuition built on simple point process models with the models introduced in this paper, we give an interpretation of infinitesimally over-dispersed models based on the idea that the additional variability comes in the form of clusters of events which infinitesimally turn into simultaneous events as the time length of the cluster tends to zero.

Consider the bivariate process $(N(t), M(t))$ where $\{N(t)\}$ is a simple univariate MCP with conditional infinitesimal event probability $q(n,1) = \lambda(n)M(t)$ and $M(t)$ is a discrete-time noise process which is constant over the time intervals $\{[t_i, t_i+\omega] : t_0, \omega \in \mathbb{R}^+,\ t_i = t_{i-1}+\omega,\ i \ge 1\}$ and has $E[M(t)] = 1$. If the values of $M(t)$ on distinct intervals $[t_i, t_{i+1}]$ were independently gamma distributed with variance inversely proportional to $\omega$ and we let $\omega \downarrow 0$, we would heuristically obtain the gamma processes from Section 3.

Consider a time interval $[t, t+h]$ with $h \gg \omega$. In this new model (before letting $\omega \downarrow 0$), the single events would form clusters over time since there would be more events in the $\omega$-intervals where the integrated event rate was higher than the mean $\lambda(n)\omega$ and fewer otherwise. More extreme (sudden) variations in the event rates would produce stronger clustering. This dependence induced by the clustering in turn increases the heterogeneity for a given mean infinitesimal rate $\lambda(n)$. This interpretation is parallel to the common practice of modeling over-dispersed binomial experiments by letting the parameter $p$ in a set of binomial experiments be stochastic, where the additional variability reflects the dependence between the $n$ individuals in each experiment.
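The clustering heuristic can be explored numerically. The sketch below is illustrative only and rests on assumptions that are not part of the original text: a constant base rate `lam` (a Poisson-type simple process), noise values drawn as gamma variates with mean 1 and variance `tau / omega` on each interval, and hypothetical function names. Under these assumptions the variance-to-mean ratio of the count over a window of length `h` should be close to `1 + lam * tau` whatever the value of `omega`.

```python
import math
import random

def poisson_draw(rng, mu):
    """Draw one Poisson(mu) variate by Knuth's multiplication method (adequate for small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def window_count(rng, lam, tau, h, omega):
    """Number of events in a window of length h when the rate lam is multiplied by
    noise M that is constant on each omega-interval, with E[M] = 1, V[M] = tau / omega."""
    n_intervals = round(h / omega)
    total = 0
    for _ in range(n_intervals):
        # gamma(shape = omega / tau, scale = tau / omega): mean 1, variance tau / omega
        m = rng.gammavariate(omega / tau, tau / omega)
        total += poisson_draw(rng, lam * omega * m)
    return total

def dispersion(lam, tau, h, omega, reps=5000, seed=1):
    """Monte Carlo estimate of the variance-to-mean ratio of the window count."""
    rng = random.Random(seed)
    counts = [window_count(rng, lam, tau, h, omega) for _ in range(reps)]
    mean = sum(counts) / reps
    var = sum((c - mean) ** 2 for c in counts) / (reps - 1)
    return var / mean
```

For example, `dispersion(2.0, 0.5, 1.0, 0.05)` estimates a ratio whose theoretical value under these assumptions is `1 + 2.0 * 0.5 = 2`, whereas a very small `tau` gives a ratio near 1, as for a Poisson process.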

Now let us consider more carefully the limit of this process. Still considering the time interval $[t, t+h]$, clusters start happening over shorter time intervals as $\omega \downarrow 0$ and there will be fewer and fewer events in each $\omega$-interval, which will tend to obscure the clustering. However, clusters may still be perceived if there are rare realizations of $M(t)$ which are extremely different from values in nearby $\omega$-intervals even as $\omega$ becomes small. One way to make sure that these extreme differences occur, so that clustering is still apparent, is to ensure there is enough variability in $M(t)$ as $\omega$ decreases. It turns out that by letting $M(t)$ be integrated white noise, one gets $V[M(t)] \propto \omega^{-1} \to \infty$ as $\omega \to 0$. This guarantees that $V[M(t)]$ is large enough for clustering to be present in $[t, t+h]$ as long as $\omega$ decreases as fast or faster than $h$, even in the limit as $h \downarrow 0$, i.e. infinitesimally. This heuristic leads to seeing simultaneous events as clusters of single events in intervals of length zero, i.e. simultaneous events can be seen as the limit of single-event clusters.
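As a worked illustration of this variance scaling, assume (this specific choice is ours, matching the gamma case used in the proofs of Appendix A) that the integrated noise increment $\Delta\Gamma$ over an $\omega$-interval is gamma distributed with mean $\omega$ and variance $\tau\omega$, and that $M(t) = \Delta\Gamma/\omega$ on that interval. Then:

```latex
E[M(t)] = \frac{E[\Delta\Gamma]}{\omega} = \frac{\omega}{\omega} = 1,
\qquad
V[M(t)] = \frac{V[\Delta\Gamma]}{\omega^{2}} = \frac{\tau\omega}{\omega^{2}} = \frac{\tau}{\omega}
\longrightarrow \infty \quad \text{as } \omega \downarrow 0 .
```

The mean of the noise stays fixed at one while its variance grows like $\omega^{-1}$, which is the behavior described above.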

Based on this interpretation of simultaneous events, two questions might be addressed. First, infinitesimally 
over-dispersed MCPs might be useful in applications where, even though exactly simultaneous events may be consid- 
ered impossible, single events can be clustered very tightly compared to the distance between event clusters. Then, 
infinitesimally over-dispersed MCPs may be a useful Markovian approximation to more complex, non-Markovian 
simple processes. In this approximation these tight clusters are approximated by clusters with distance between cluster members equal to zero. One instance of applications that fall in this category is infectious diseases, where it is hard to imagine ever having event time data and where it is plausible that a group of susceptibles gets infected with very short inter-event times compared to the time until another infectious individual infects another susceptible.

Second, if the goal is to model a process where it is natural to consider simultaneous events, there are other 
mechanisms that could be used, like a build-up of single events which, past a threshold, becomes a batch event.
Applications falling in this category include production systems and rental businesses [28], physical processes and 
quantum optics [12] and internet traffic [23]. In this case, the infinitesimally over-dispersed MCPs presented here 
might serve as a null "Markovian" model against which to test other more intricate models for dependence. 
Appendix A. Proofs for univariate Markov counting processes

Proof of Theorem 5 (sufficient condition for mixed Markov infinitesimal equi-dispersion). Letting the process be a mixed process means that $\Lambda(t)$ is now stochastic because of its dependence on the process $\{N(t)\}$, as in the non-mixing case, but also on the random variable $M$. The result for the mean follows again by dominated convergence but the dominated functions are now

$$\int \Lambda\,(1-\exp\{-h\Lambda\})\,f_{\Lambda|M,N(t)=n}(\Lambda,m)\,d\Lambda \le \int \Lambda\,f_{\Lambda|M,N(t)=n}(\Lambda,m)\,d\Lambda$$

for all $m$, and the dominating function has a finite integral since $E[\Lambda(t)]$ is assumed to be finite, i.e.

$$\int\!\!\int \Lambda\,f_{\Lambda|M,N(t)=n}(\Lambda,m)\,d\Lambda\,f_{M|N(t)=n}(m)\,dm = E\big[E[\Lambda(t)\mid M]\big] < \infty.$$

Then

$$\lim_{h\downarrow 0}\frac{E\big[E[h\Lambda(t)(1-\exp\{-h\Lambda(t)\})\mid M]\big]}{h} = E\Big[\lim_{h\downarrow 0}\Lambda(t,M)\big(1-\exp\{-h\Lambda(t,M)\}\big)\Big] = 0.$$
The result for the variance follows again along the same lines as for the non-mixing case, i.e.

$$E\big[(\Delta N(t))^2\,I\{\Delta N(t) > 1\}\big] = E\big[E[(\Delta N(t))^2\,I\{\Delta N(t) > 1\}\mid M]\big] \le 2h^2E\big[E[\Lambda^2(t)\mid M]\big] = o(h),$$

by the assumption that $E[\Lambda^2(t)] < \infty$. These two results show that the same terms that vanished in Theorem 1 vanish now as well. Then,

$$E[\Delta N(t)] = E[I\{\Delta N(t) = 1\}] + o(h) = E\big[E[I\{\Delta N(t) = 1\}\mid M]\big] + o(h)$$
$$= E\big[\lambda(n)\exp\{-h\lambda(n+1)\}\varphi(h)\big] + o(h) \tag{A.1}$$
$$= E[h\lambda(n) + o(h)] + o(h)$$

where (A.1) follows by Lemma 19. Here

$$\varphi(h) = \begin{cases} h & \text{if } \lambda(n) = \lambda(n+1),\\[1ex] \dfrac{1-\exp\{-h(\lambda(n)-\lambda(n+1))\}}{\lambda(n)-\lambda(n+1)} & \text{if } \lambda(n) \ne \lambda(n+1).\end{cases}$$

For $\lambda(n) > \lambda(n+1)$, $\varphi(h) < h$ and for $\lambda(n) < \lambda(n+1)$, $\varphi(h) = O(h)$. For $h$ small enough, $\varphi(h) < 1$ and the functions inside the expected value in (A.1) are bounded by $\lambda(n)$. Then, taking limits gives the desired result via dominated convergence

$$\lim_{h\downarrow 0}\frac{E[\Delta N(t)]}{h} = E[\lambda(n)],$$

where $E[\lambda(n)] \le E[\Lambda(t)] < \infty$. The same argument gives $\lim_{h\downarrow 0} h^{-1}E[(\Delta N(t))^2] = E[\lambda(n)]$ and the dispersion results of Theorem 1 follow. □

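Proof of Proposition 8 (Poisson gamma process). We denote the Poisson rate $\rho$ in this proof and reserve $\alpha$ for the gamma shape parameter. Since $\{N(t)\}$ is a conditional Poisson process,

$$P(\Delta N(t) = k\mid N(t), \Delta\Gamma(t)) = \frac{(\rho\Delta\Gamma(t))^k e^{-\rho\Delta\Gamma(t)}}{k!}.$$

It is a standard result that if $\rho\Delta\Gamma(t)$ follows a gamma distribution with mean $\rho h$ and variance $\rho^2\tau h$, the distribution of the increments of $\{N(t)\}$ is negative binomial with probability mass function

$$P(\Delta N(t)=k\mid N(t)=n) = \frac{\Gamma(\tau^{-1}h+k)}{k!\,\Gamma(\tau^{-1}h)}\,p^{\tau^{-1}h}(1-p)^k \tag{A.2}$$

with $p = (1+\tau\rho)^{-1}$. The limiting probabilities follow by a Taylor series expansion about $h = 0$:

$$P(\Delta N(t)=0\mid N(t)=n) = p^{\tau^{-1}h} = 1+\tau^{-1}\log(p)h + o(h)$$
$$P(\Delta N(t)=k\mid N(t)=n) = \Big(\frac{\tau^{-1}h}{k} + o(h)\Big)\big(1+\tau^{-1}\log(p)h + o(h)\big)(1-p)^k = \frac{(1-p)^k}{k\tau}\,h + o(h),$$

for $k > 0$, using in (A.2)

$$\frac{\Gamma(\eta+k)}{k!\,\Gamma(\eta)} = k!^{-1}(\eta+k-1)\times(\eta+k-2)\times\cdots\times(\eta+2)\times(\eta+1)\times\eta$$
$$= \sum_{j=0}^{k-1}k!^{-1}\varphi_j h^j \times \eta = k^{-1}\eta + \sum_{j=1}^{k-1}k!^{-1}\varphi_j h^{j+1}\tau^{-1}$$
$$= k^{-1}\eta + o(h),$$

with $\eta = \tau^{-1}h$, where the $\varphi_j$ are constants not depending on $h$ and $\varphi_0 = (k-1)!$. Recalling that $p = (1+\tau\alpha)^{-1}$, with $\alpha$ the Poisson rate in the notation of the proposition, the moments follow by

$$E[\Delta N(t)\mid N(t)=n] = \tau^{-1}h\,\frac{1-p}{p} = (1+\tau\alpha)\tau^{-1}h - \tau^{-1}h = \alpha h$$
$$V[\Delta N(t)\mid N(t)=n] = \tau^{-1}h\,\frac{1-p}{p^2} = (1+\tau\alpha)\alpha h$$

□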
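The limiting probabilities and moments above can be checked numerically. The following is a minimal sketch, not part of the original derivation; the function names are hypothetical and the parameter `rate` plays the role of the Poisson rate. It evaluates the negative binomial mass function (A.2) on the log scale and compares it with the first-order coefficient $(1-p)^k/(k\tau)$.

```python
import math

def nb_pmf(k, h, tau, rate):
    """P(dN = k) from (A.2): Poisson counts whose integrated rate is gamma
    with mean rate*h and variance rate**2 * tau * h."""
    eta = h / tau                    # gamma shape, eta = h / tau
    p = 1.0 / (1.0 + tau * rate)     # negative binomial success probability
    log_coef = math.lgamma(eta + k) - math.lgamma(k + 1) - math.lgamma(eta)
    return math.exp(log_coef + eta * math.log(p) + k * math.log(1.0 - p))

def limiting_q(k, tau, rate):
    """First-order coefficient: P(dN = k) = q(k) h + o(h) for k >= 1."""
    p = 1.0 / (1.0 + tau * rate)
    return (1.0 - p) ** k / (k * tau)

def nb_moments(h, tau, rate, kmax=2000):
    """Mean and variance of the increment by direct summation of the pmf."""
    probs = [nb_pmf(k, h, tau, rate) for k in range(kmax)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum(k * k * pk for k, pk in enumerate(probs)) - mean ** 2
    return mean, var
```

With `tau = 0.5` and `rate = 2.0`, the ratio `nb_pmf(k, h, tau, rate) / (h * limiting_q(k, tau, rate))` approaches 1 as `h` shrinks, and `nb_moments(1.0, 0.5, 2.0)` should reproduce the mean $\alpha h$ and the variance $(1+\tau\alpha)\alpha h$ up to truncation and rounding error.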



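Proof of Proposition 9 (binomial gamma process). Since $\{N(t)\}$ is the counting process associated with a conditional linear death process, the increment process is binomial with parameters size $n$ and event probability $\Pi(t) = 1 - e^{-\delta\Delta\Gamma(t)}$, i.e.

$$P(\Delta N(t) = k\mid N(t) = n, \Delta\Gamma(t)) = \binom{n}{k}\Pi(t)^k(1-\Pi(t))^{n-k},$$

for $k \in \{0, 1, \ldots, n\}$. We integrate out the continuous-time gamma noise using the fact that $\delta\Delta\Gamma(t)$ follows a gamma distribution with mean $\delta h$ and variance $\delta^2\tau h$ and completing the resulting incomplete gamma density making use of the multinomial theorem as follows

$$P(\Delta N = k\mid N(t) = n) = \int_0^\infty \binom{n}{k}[1-e^{-x}]^k[e^{-x}]^{n-k}\frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x}\,dx$$
$$= \binom{n}{k}\int_0^\infty\Big[\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j}e^{-x(n-j)}\Big]\frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x}\,dx$$
$$= \binom{n}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j}\Big(\frac{\beta}{\beta+n-j}\Big)^{\alpha},$$

for $k \in \{0, \ldots, n\}$ and recalling that $\alpha = h\tau^{-1}$ and $\beta = \delta^{-1}\tau^{-1}$. The limiting probabilities follow by a Taylor series expansion about $h = 0$:

$$P(\Delta N(t) = 0\mid N(t) = n) = (1+\delta\tau n)^{-h\tau^{-1}} = 1 - \tau^{-1}\ln(1+\delta\tau n)h + o(h)$$
$$P(\Delta N(t) = k\mid N(t) = n) = \binom{n}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j+1}\tau^{-1}\ln\big(1+\delta\tau(n-j)\big)h + o(h)$$

for $k \ge 1$, since by the binomial theorem $\sum_{j=0}^{k}\binom{k}{j}(-1)^{k-j} = (1-1)^k = 0$.
The moments are

$$E[\Delta N(t)\mid N(t) = n] = nE[\Pi(t)\mid N(t) = n] = nE[1-e^{-\delta\Delta\Gamma(t)}\mid N(t) = n]$$
$$V[\Delta N(t)\mid N(t) = n] = V[n\Pi(t)\mid N(t) = n] + E[n\Pi(t)(1-\Pi(t))\mid N(t) = n]$$
$$= E[\Delta N(t)\mid N(t) = n] + n\big[nV[\Pi(t)\mid N(t) = n] - E[\Pi^2(t)\mid N(t) = n]\big]$$

Let $Y = \delta\Delta\Gamma(t)$. To obtain a closed-form solution for the binomial gamma process, where the probability of death is $\Pi(t) = 1 - e^{-\delta\Delta\Gamma(t)}$, we need $E[e^{-Y}]$, $V[e^{-Y}]$ and $E[(1-e^{-Y})^2]$, which we can get using the moment generating function $E[e^{zY}] = (1-z\delta\tau)^{-h/\tau}$ for $z\delta\tau < 1$ and $h, \delta, \tau > 0$. This gives after a Taylor expansion around $h = 0$

$$E[e^{-Y}] = (1+\delta\tau)^{-h/\tau} = 1 - \tau^{-1}\ln(1+\delta\tau)h + o(h)$$
$$V[e^{-Y}] = E[e^{-2Y}] - E[e^{-Y}]^2 = (1+2\delta\tau)^{-h/\tau} - (1+\delta\tau)^{-2h/\tau}$$
$$= \big(1 - \tau^{-1}\ln(1+2\delta\tau)h + o(h)\big) - \big(1 - \tau^{-1}\ln((1+\delta\tau)^2)h + o(h)\big)$$
$$E[(1-e^{-Y})^2] = 1 - 2\big(1 - \tau^{-1}\ln(1+\delta\tau)h + o(h)\big) + \big(1 - \tau^{-1}\ln(1+2\delta\tau)h + o(h)\big)$$

Plugging these results in the moment expressions above,

$$E[\Delta N(t)\mid N(t) = n] = n\tau^{-1}\ln(1+\delta\tau)h + o(h)$$
$$V[\Delta N(t)\mid N(t) = n] = n\tau^{-1}\ln(1+\delta\tau)h + n\tau^{-1}\Big[(n-1)\ln\Big(\frac{(1+\delta\tau)^2}{1+2\delta\tau}\Big)\Big]h + o(h)$$

Since $\frac{(1+\delta\tau)^2}{1+2\delta\tau} > 1$ for $\delta\tau > 0$, it follows that the process is over-dispersed for $n > 1$ and equi-dispersed for $n = 1$. □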
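As a numerical cross-check of this proof, the infinitesimal probabilities can be summed directly and compared with the closed-form moments. This is a sketch under the notation of the proof (parameters `delta` and `tau`, population size `n`); the function names are hypothetical. It uses the fact that, for finite `n`, the first-order mean and variance rates equal $\sum_k k\,q(n,k)$ and $\sum_k k^2 q(n,k)$.

```python
import math

def q_binomial_gamma(n, k, delta, tau):
    """Infinitesimal probability q(n, k), k >= 1, of the binomial gamma process:
    P(dN = k | N = n) = q(n, k) h + o(h)."""
    total = sum(
        math.comb(k, j) * (-1) ** (k - j + 1) * math.log(1.0 + delta * tau * (n - j))
        for j in range(k + 1)
    )
    return math.comb(n, k) * total / tau

def infinitesimal_moments(n, delta, tau):
    """Mean rate and variance rate obtained by summing k q(n,k) and k^2 q(n,k)."""
    qs = [q_binomial_gamma(n, k, delta, tau) for k in range(1, n + 1)]
    mean_rate = sum(k * q for k, q in enumerate(qs, start=1))
    var_rate = sum(k * k * q for k, q in enumerate(qs, start=1))
    return mean_rate, var_rate

def dispersion_closed_form(n, delta, tau):
    """1 + (n - 1) [2 ln(1 + delta tau) - ln(1 + 2 delta tau)] / ln(1 + delta tau)."""
    l1 = math.log(1.0 + delta * tau)
    l2 = math.log(1.0 + 2.0 * delta * tau)
    return 1.0 + (n - 1) * (2.0 * l1 - l2) / l1
```

For instance, with `n = 5`, `delta = 0.7` and `tau = 0.4`, the ratio of the two values returned by `infinitesimal_moments` should agree with `dispersion_closed_form(5, 0.7, 0.4)` up to rounding error, and it reduces to 1 when `n = 1`.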

Proof of Proposition 10 (negative binomial gamma process). In order to parallel the proof for the binomial gamma process of Proposition 9, we let the individual birth rate be $\rho$ in this proof, while in the proposition it is represented by $\beta$. This way we can still use $\beta$ for the gamma scale parameter. Since $\{N(t)\}$ is a conditional linear birth process, the increment process is negative binomial with parameters number of successes $n$ and success probability $\Pi(t) = e^{-\rho\Delta\Gamma(t)}$, i.e.

$$P(\Delta N(t) = k\mid N(t) = n, \Delta\Gamma(t)) = \binom{n+k-1}{k}\Pi(t)^n(1-\Pi(t))^k,$$

for $k \in \mathbb{N}$. Following a derivation parallel to that of the binomial gamma process of Proposition 9,

$$P(\Delta N = k\mid N(t) = n) = \int_0^\infty\binom{n+k-1}{k}[e^{-x}]^n[1-e^{-x}]^k\frac{\beta^\alpha}{\Gamma(\alpha)}x^{\alpha-1}e^{-\beta x}\,dx = \binom{n+k-1}{k}\sum_{j=0}^{k}\binom{k}{j}(-1)^{j}\Big(\frac{\beta}{\beta+n+j}\Big)^{\alpha},$$

with $\alpha = h\tau^{-1}$ and $\beta = \rho^{-1}\tau^{-1}$. The limiting probabilities follow by a Taylor series expansion about $h = 0$ like in the proof of Proposition 9. The moments can be found as follows. Consider the odds against a birth $\Theta(t) = \frac{1-\Pi(t)}{\Pi(t)}$ given the probability of a birth $\Pi(t)$.

Then

$$E[\Delta N(t)\mid N(t) = n] = nE[\Theta(t)\mid N(t) = n] = nE[e^{\rho\Delta\Gamma(t)} - 1\mid N(t) = n]$$
$$V[\Delta N(t)\mid N(t) = n] = V[n\Theta(t)\mid N(t) = n] + E[n\Theta(t)(1+\Theta(t))\mid N(t) = n]$$
$$= E[\Delta N(t)\mid N(t) = n] + n\big[nV[\Theta(t)\mid N(t) = n] + E[\Theta^2(t)\mid N(t) = n]\big]$$

Let, as for the binomial gamma process, $Y = \rho\Delta\Gamma(t)$, which follows a gamma distribution with mean $\rho h$ and variance $\rho^2\tau h$. To obtain a closed-form solution for the negative binomial gamma process, where the odds against a birth is $\Theta(t) = e^{\rho\Delta\Gamma(t)} - 1$, we need $E[e^{Y}]$, $V[e^{Y}]$ and $E[(e^{Y}-1)^2]$, which we can get using the moment generating function $E[e^{zY}] = (1-z\rho\tau)^{-h/\tau}$ for $z\rho\tau < 1$ and $h, \rho, \tau > 0$. This gives after a Taylor expansion around $h = 0$

$$E[e^{Y}] = (1-\rho\tau)^{-h/\tau} = 1 - \tau^{-1}\ln(1-\rho\tau)h + o(h)$$
$$V[e^{Y}] = E[e^{2Y}] - E[e^{Y}]^2 = (1-2\rho\tau)^{-h/\tau} - (1-\rho\tau)^{-2h/\tau}$$
$$= \big(1 - \tau^{-1}\ln(1-2\rho\tau)h + o(h)\big) - \big(1 - \tau^{-1}\ln((1-\rho\tau)^2)h + o(h)\big)$$
$$E[(e^{Y}-1)^2] = 1 - 2\big(1 - \tau^{-1}\ln(1-\rho\tau)h + o(h)\big) + \big(1 - \tau^{-1}\ln(1-2\rho\tau)h + o(h)\big)$$

Note that we require that $2\tau\rho < 1$. Plugging this into the moment expressions gives

$$E[\Delta N(t)\mid N(t) = n] = n\tau^{-1}\ln\Big(\frac{1}{1-\rho\tau}\Big)h + o(h)$$
$$V[\Delta N(t)\mid N(t) = n] = n\tau^{-1}\ln\Big(\frac{1}{1-\rho\tau}\Big)h + n\tau^{-1}\Big[(n-1)\ln\Big(\frac{(1-\rho\tau)^2}{1-2\rho\tau}\Big)\Big]h + o(h)$$

Since $\frac{(1-\rho\tau)^2}{1-2\rho\tau} > 1$ for $\rho\tau > 0$ and $2\tau\rho < 1$, it follows that the process is also over-dispersed for $n > 1$ and equi-dispersed for $n = 1$. □

Proof of Proposition 11 (binomial beta process). Since $\{N(t)\}$ is the counting process associated with a conditional linear death process, the increment process is binomial with parameters size $n$ and death probability $\Pi(t)$, i.e.

$$P(\Delta N(t)=k\mid N(t)=n, \Pi(t)) = \binom{n}{k}(\Pi(t))^k(1-\Pi(t))^{n-k},$$

for $k \in \{0, 1, \ldots, n\}$.

We integrate out the beta noise using the fact that $\Delta N(t)$ conditional on $N(t) = n$ has a beta binomial distribution with the corresponding parameters.

The beta binomial probability mass function of $\Delta N(t)$ given $N(t) = n$ is

$$P(\Delta N(t) = k\mid N(t) = n) = \binom{n}{k}\frac{\Gamma(\alpha+\beta)\Gamma(k+\alpha)\Gamma(n-k+\beta)}{\Gamma(\alpha)\Gamma(\beta)\Gamma(\alpha+\beta+n)} \tag{A.3}$$
$$= \binom{n}{k}\frac{\Gamma(\alpha+\beta)\Gamma(k+\alpha)\Gamma(\beta)\Big\{\frac{\Gamma(c+n-k)}{\Gamma(c)} + O(h)\Big\}}{\Gamma(\alpha)\Gamma(\beta)\frac{\Gamma(c+n)}{\Gamma(c)}\Gamma(\alpha+\beta)} \tag{A.4}$$
$$= \binom{n}{k}\frac{\Gamma(k)\Gamma(c+n-k)}{\Gamma(c+n)}\,c\delta h + o(h) \tag{A.5}$$

for $k \in \{0, \ldots, n\}$. (A.4) follows from (A.3) via an application of Lemma 18 in this appendix. Specifically, using Lemma 18 with $i = n - k$, it follows that

$$\Gamma(n-k+\beta) = \Big\{\frac{\Gamma(c+n-k)}{\Gamma(c)} + O(h)\Big\}\Gamma(\beta), \tag{A.6}$$

and, since $\alpha+\beta = c$,

$$\Gamma(\alpha+\beta+n) = \Gamma(c+n) = \frac{\Gamma(c+n)}{\Gamma(c)}\Gamma(c) = \frac{\Gamma(c+n)}{\Gamma(c)}\Gamma(\alpha+\beta). \tag{A.7}$$

Plugging (A.6) and (A.7) into (A.3) gives (A.4). Then, using $\alpha = c\delta h + o(h)$ and canceling terms gives (A.5), which corresponds to the infinitesimal probabilities.

The moments of a beta binomial distribution are a standard result. Since $\alpha = c(1-e^{-\delta h})$ and $\beta = ce^{-\delta h}$ and $c = \frac{n-1}{\omega} - 1$ for $n > 1$, Taylor expansions around $h = 0$ then give

$$E[\Delta N(t)\mid N(t)=n] = n\frac{\alpha}{\alpha+\beta} = n\delta h + o(h)$$
$$V[\Delta N(t)\mid N(t)=n] = n\frac{\alpha\beta}{(\alpha+\beta)^2}\,\frac{n+\alpha+\beta}{1+\alpha+\beta} = n\delta h\,\frac{c+n}{c+1} + o(h) = n\delta h\,(1+\omega) + o(h),$$

for $n > 1$ and it follows that the binomial beta process is over-dispersed for $\omega > 0$. If $n = 1$ the process is equi-dispersed as

$$V[\Delta N(t)\mid N(t) = n] = n\frac{\alpha\beta}{(\alpha+\beta)^2} = n\delta h + o(h)$$

□
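The first-order beta binomial moments can be verified numerically from the standard exact formulas. The sketch below is illustrative; the function names are hypothetical, and the relation `c = (n - 1) / omega - 1` used in the example is the parameterization assumed in the reconstruction above, under which the limiting variance-to-mean ratio `(c + n) / (c + 1)` equals `1 + omega`.

```python
import math

def beta_binomial_moments(n, delta, c, h):
    """Exact mean and variance of a beta binomial increment with
    alpha = c (1 - exp(-delta h)) and beta = c exp(-delta h)."""
    alpha = -c * math.expm1(-delta * h)   # c (1 - e^{-delta h}), computed stably
    beta = c * math.exp(-delta * h)
    mean = n * alpha / (alpha + beta)
    var = (n * alpha * beta / (alpha + beta) ** 2
           * (n + alpha + beta) / (1.0 + alpha + beta))
    return mean, var

def limiting_dispersion(n, c):
    """First-order variance-to-mean ratio (c + n) / (c + 1)."""
    return (c + n) / (c + 1.0)
```

For example, with `n = 6`, `delta = 0.3` and `omega = 0.5`, taking `c = (n - 1) / omega - 1 = 9` gives `limiting_dispersion(6, 9.0) = 1.5`, and `beta_binomial_moments(6, 0.3, 9.0, h)` has a variance-to-mean ratio approaching that value as `h` decreases.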

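Lemma 18. For $\alpha = c(1-e^{-\delta h})$, $\beta = ce^{-\delta h}$, $c > 0$ and $i \in \{1, 2, \ldots\}$,

$$\Gamma(\beta+i) = \Big\{\frac{\Gamma(c+i)}{\Gamma(c)} + O(h)\Big\}\Gamma(\beta).$$

Proof. Since $\beta = c - \alpha$, and by the definition of the gamma function, for $i \ge 1$,

$$\Gamma(\beta+i) = (c-\alpha+(i-1))\times(c-\alpha+(i-2))\times\cdots\times(c-\alpha)\times\Gamma(\beta)$$
$$= \{(c+(i-1))\times(c+(i-2))\times\cdots\times(c) + O(h)\}\Gamma(\beta)$$
$$= \Big\{\prod_{j=0}^{i-1}(c+j) + O(h)\Big\}\Gamma(\beta),$$

and $\prod_{j=0}^{i-1}(c+j) = \Gamma(c+i)/\Gamma(c)$.

□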

Appendix B. A lemma required for Theorem 12

This technical result is similar to, but slightly different from, standard results on Markov chains. 
Lemma 19 (probability of single jump of any size in multivariate compound processes). Let $\{X(t)\}$ be a time homogeneous, stable and conservative Markov counting system with associated multivariate counting process $\{N(t)\}$ defined by (8) and (7), as in Theorem 12. Consider a starting time $t$ and let $U$ be the time between $t$ and the first jump time and $V$ be the time between $t+U$ and the second jump time, where jumps can be of size one or more. Let $S$ be the event of exactly one single jump in any of the $\{N_{ij}(t) : i \ne j\}$ counting processes occurring in $[t, t+h]$. Then, letting $\lambda_U = \lambda(x)$ be the rate function of $\{X(t)\}$ during $[t, t+U]$, and $\Lambda_V = \lambda(X(t+U))$ be this conditional rate function during $[t+U, t+U+V]$,

$$P(S\mid X(t)=x, \Lambda_V) = \lambda_U e^{-\Lambda_V h}\varphi(h)$$

where

$$\varphi(h) = \begin{cases} h & \text{if } \lambda_U = \Lambda_V,\\[1ex] \dfrac{1-e^{-h(\lambda_U-\Lambda_V)}}{\lambda_U-\Lambda_V} & \text{if } \lambda_U \ne \Lambda_V,\end{cases}$$

and

$$P(S\mid X(t)=x) = h\sum_{i\ne j}\sum_{k\ge 1}q_{ij}(x,k) + o(h).$$

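Proof. Start by fixing the random variable $\Lambda_V$ at a given constant, say $\lambda_V$. Given the Markov property, for the starting time $t$, the densities of the exponential inter-event times are $f_U(u) = \lambda_U e^{-\lambda_U u}$ for $u > 0$ and $f_V(v) = \lambda_V e^{-\lambda_V v}$ for $v > 0$. Then,

$$P(S\mid X(t)=x, \lambda_V) = P(U < h, U+V > h) = P(U < h, V > h-U)$$
$$= \int_0^h\!\!\int_{h-u}^\infty f_{U,V}(u,v)\,dv\,du = \int_0^h\!\!\int_{h-u}^\infty \lambda_U e^{-\lambda_U u}\lambda_V e^{-\lambda_V v}\,dv\,du$$
$$= \int_0^h \lambda_U e^{-\lambda_U u}\lambda_V\int_{h-u}^\infty e^{-\lambda_V v}\,dv\,du = \int_0^h \lambda_U e^{-\lambda_U u}e^{-\lambda_V(h-u)}\,du$$
$$= \lambda_U e^{-\lambda_V h}\int_0^h e^{-u(\lambda_U-\lambda_V)}\,du. \tag{B.1}$$

If the event rate is not changed by the first event happening (as happens in a Poisson process, but unlike linear birth or death processes), then we can write $\lambda_U = \lambda_V = \lambda$ in (B.1) and

$$P(S\mid X(t)=x, \lambda_V) = \lambda_U e^{-\lambda_V h}h \tag{B.2}$$
$$= \lambda h(1-\lambda h + o(h))$$
$$= \lambda h + o(h). \tag{B.3}$$

If $\lambda_U \ne \lambda_V$, then from (B.1)

$$P(S\mid X(t)=x, \lambda_V) = \lambda_U e^{-\lambda_V h}\,\frac{1-e^{-h(\lambda_U-\lambda_V)}}{\lambda_U-\lambda_V} \tag{B.4}$$
$$= \lambda_U\,\frac{e^{-\lambda_V h}-e^{-\lambda_U h}}{\lambda_U-\lambda_V}$$
$$= \lambda_U\,\frac{1-h\lambda_V+o(h)-1+h\lambda_U+o(h)}{\lambda_U-\lambda_V}$$
$$= \lambda_U h\,\frac{\lambda_U-\lambda_V}{\lambda_U-\lambda_V} + o(h)$$
$$= \lambda_U h + o(h). \tag{B.5}$$

Combining (B.2) and (B.4), replacing $\lambda_V$ by $\Lambda_V$ and conditioning on $\Lambda_V$ gives the first result in the lemma. For the general case where $\Lambda_V$ is stochastic, the limit in

$$\lim_{h\downarrow 0}h^{-1}P(S\mid X(t)=x) = \lim_{h\downarrow 0}h^{-1}E[P(S\mid X(t)=x, \Lambda_V)] = E\big[\lim_{h\downarrow 0}h^{-1}(\lambda_U h + o(h))\big] = \lambda_U,$$

can be passed inside the expected value since $P(S\mid X(t)=x, \Lambda_V)f_{\Lambda_V} \le f_{\Lambda_V}$ and $\int f_{\Lambda_V}(r)\,dr = 1$. □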
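The closed form for the single-jump probability can be checked against a direct numerical evaluation of the integral preceding (B.1). The following is a minimal sketch with hypothetical function names, assuming fixed rates `lam_u` before the first jump and `lam_v` after it; the midpoint rule stands in for the exact integration carried out in the proof.

```python
import math

def phi(h, lam_u, lam_v):
    """phi(h) from Lemma 19 for fixed rates lam_u and lam_v."""
    if lam_u == lam_v:
        return h
    d = lam_u - lam_v
    return (1.0 - math.exp(-h * d)) / d

def single_jump_prob(h, lam_u, lam_v):
    """Closed form lam_u * exp(-lam_v * h) * phi(h), as in (B.2) and (B.4)."""
    return lam_u * math.exp(-lam_v * h) * phi(h, lam_u, lam_v)

def single_jump_prob_numeric(h, lam_u, lam_v, steps=20000):
    """Midpoint-rule value of the integral over u in [0, h] of
    lam_u * exp(-lam_u * u) * exp(-lam_v * (h - u))."""
    du = h / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += lam_u * math.exp(-lam_u * u) * math.exp(-lam_v * (h - u))
    return total * du
```

Dividing `single_jump_prob(h, lam_u, lam_v)` by a small `h` gives a value close to `lam_u`, which is the first-order statement (B.5): only the rate before the first jump matters infinitesimally.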

Appendix C. Simple MCPs: subordination and multiplicative Lévy white noise

Let $\{M_\lambda(t)\}$ be the simple, time homogeneous, conservative and stable MCP with rate function $\lambda : \mathbb{N} \to \mathbb{R}^+$ of Theorem 7. It will be convenient here to write $M^{(\lambda)}(t)$ instead of $M_\lambda(t)$. Write $\pi^{(\lambda)}_{m,k}(h)$ for the integrated increment (or transition) probabilities of $\{M^{(\lambda)}(t)\}$, defined as

$$\pi^{(\lambda)}_{m,k}(h) = P(\Delta M(t) = k\mid M(t) = m).$$

For $\{M^{(\lambda)}(t)\}$, Kolmogorov's Backward Differential System is satisfied [4], i.e.

$$\frac{d}{dh}\pi^{(\lambda)}_{m,k}(h) = \big[\pi^{(\lambda)}_{m+1,k-1}(h) - \pi^{(\lambda)}_{m,k}(h)\big]\lambda(m). \tag{C.1}$$

This suggests the following definition of $\{M^{(\lambda\xi)}(t)\}$, a simple MCP $\{M^{(\lambda)}(t)\}$ with multiplicative continuous-time noise in the rate function, where $\{\xi(t)\} = \{dL(t)/dt\}$ for a non-decreasing Lévy integrated noise process $\{L(t)\}$ with $L(0) = 0$ and $E[L(t)] = t$ as in Theorem 7. Define the process $\{M^{(\lambda\xi)}(t)\}$ by

$$\pi^{(\lambda\xi)}_{m,k}(h) = E\big[\Pi^{(\lambda\xi)}_{m,k}(h)\big]$$

where $\Pi^{(\lambda\xi)}_{m,k}(h)$ is specified, by analogy to (C.1), as the solution to a stochastic differential equation

$$d\Pi^{(\lambda\xi)}_{m,k}(h) = \big[\Pi^{(\lambda\xi)}_{m+1,k-1}(h-) - \Pi^{(\lambda\xi)}_{m,k}(h-)\big]\lambda(m)\,dL(h), \tag{C.2}$$

or, essentially equivalently,

$$\Pi^{(\lambda\xi)}_{m,k}(h) = \Pi^{(\lambda\xi)}_{m,k}(0) + \int_0^h\big[\Pi^{(\lambda\xi)}_{m+1,k-1}(r-) - \Pi^{(\lambda\xi)}_{m,k}(r-)\big]\lambda(m)\,dL(r). \tag{C.3}$$

To give meaning to (C.2) and (C.3), it is necessary to define a stochastic integral. Here, we use the Marcus canonical stochastic integral with Marcus map $\Phi(u,x,y) = \pi^{(\lambda)}_{m,k}(x+uy)$. The Marcus canonical integral is a stochastic integral developed in the context of Lévy calculus [2]. It is constructed to satisfy a chain rule of the Newton-Leibniz type (unlike the Itô integral). In the case of continuous Lévy processes, the Marcus canonical integral becomes the Stratonovich integral. For jump processes, the Marcus canonical integral heuristically corresponds to approximating trajectories by increasingly accurate continuous piecewise linear functions. We interpret (C.2) as a stochastic version of (C.1). We then think of $\Pi^{(\lambda\xi)}_{m,k}(h)$ as stochastic transition probabilities, conditional on the noise process, giving rise to deterministic transition probabilities $\pi^{(\lambda\xi)}_{m,k}(h)$ once this noise is integrated out.
Proof of Theorem 7 (Levy white noise and subordination). By definition 

Applying Theorem 4.4.28 of Applebaum [2] with $f(L(h)) = \pi^{(\lambda)}_{m,k}(L(h))$, it follows that $f \in C^3(\mathbb{R})$ by smoothness of $f$ implied by (C.1) and then, since $\pi^{(\lambda)}_{m,k}(0) = 0$, that

$$\pi^{(\lambda)}_{m,k}(L(h)) = \int_0^h\big[\pi^{(\lambda)}_{m+1,k-1}(L(r-)) - \pi^{(\lambda)}_{m,k}(L(r-))\big]\lambda(m)\,dL(r),$$

so that $\pi^{(\lambda)}_{m,k}(L(h))$ satisfies (C.3). Given uniqueness and existence of (C.3), it follows that $\Pi^{(\lambda\xi)}_{m,k}(h) = \pi^{(\lambda)}_{m,k}(L(h))$ and hence that $\pi^{(\lambda\xi)}_{m,k}(h) = E\big[\pi^{(\lambda)}_{m,k}(L(h))\big]$. □